In this paper, we study the approximation and estimation of s-concave
densities via Rényi divergence. We first show that the approximation of a
probability measure $Q$ by an s-concave density exists and is unique via the
procedure of minimizing a divergence functional proposed by Koenker and
Mizera (2010) if and only if $Q$ admits full-dimensional support and a first
moment. We also show continuity of the divergence functional in $Q$: if
$Q_n \to Q$ in the Wasserstein metric, then the projected densities converge
in weighted $L_1$ metrics and uniformly on closed subsets of the continuity
set of the limit. Moreover, directional derivatives of the projected
densities also enjoy local uniform convergence. This contains both
on-the-model and off-the-model situations, and entails strong consistency of
the divergence estimator of an s-concave density under mild conditions. One
interesting and important feature of the Rényi divergence estimator of an
s-concave density is that the estimator is intrinsically related to the
estimation of log-concave densities via maximum likelihood methods. In fact,
we show that for $d = 1$ at least, the Rényi divergence estimators for
s-concave densities converge to the maximum likelihood estimator of a
log-concave density as $s \nearrow 0$. The Rényi divergence estimator shares
characterizations similar to those of the MLE for log-concave distributions,
which allows us to develop pointwise asymptotic distribution theory assuming
that the underlying density is s-concave.
1. Introduction.


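1.1. Overview. The class of s-concave densities on $\mathbb{R}^d$ is defined by the
generalized means of order $s$ as follows. Let

$$
M_s(a,b;\theta) :=
\begin{cases}
\left((1-\theta)a^s + \theta b^s\right)^{1/s}, & s \neq 0,\ a,b > 0,\\
0, & s < 0,\ ab = 0,\\
a^{1-\theta}b^{\theta}, & s = 0,\\
a \wedge b, & s = -\infty.
\end{cases}
$$

Then a density $p(\cdot)$ on $\mathbb{R}^d$ is called s-concave, i.e. $p \in \mathcal{P}_s(\mathbb{R}^d)$, if and only if
for all $x_0, x_1 \in \mathbb{R}^d$ and $\theta \in (0,1)$, $p((1-\theta)x_0 + \theta x_1) \ge M_s(p(x_0), p(x_1); \theta)$.
This definition apparently goes back to Avriel (1972) with further studies by
Borell (1974, 1975), Das Gupta (1976), Rinott (1976), and Uhrin (1984); see
also Dharmadhikari and Joag-Dev (1988) for a nice summary. It is easy to
see that the densities $p(\cdot)$ have the form $p = \varphi_+^{1/s}$ for some concave function
$\varphi$ if $s > 0$, $p = \exp(\varphi)$ for some concave $\varphi$ if $s = 0$, and $p = \varphi_+^{1/s}$ for some
convex $\varphi$ if $s < 0$. The function classes $\mathcal{P}_s$ are nested in $s$ in that for every
$r > 0 > s$, we have $\mathcal{P}_r \subset \mathcal{P}_0 \subset \mathcal{P}_s \subset \mathcal{P}_{-\infty}$.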
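The definition can be explored numerically. The following is a minimal sketch, not part of the original development: the helper names `generalized_mean`, `violates_s_concavity` and `student_t_density` are hypothetical, and the check only tests finitely many triples $(x_0, x_1, \theta)$, so it can refute s-concavity on a grid but cannot prove it. Applying the formula directly when $s > 0$ and $ab = 0$ is an assumption of this sketch, since the definition above does not cover that case.

```python
import math


def generalized_mean(a, b, theta, s):
    """Generalized mean M_s(a, b; theta) for a, b >= 0 and 0 < theta < 1."""
    if s == -math.inf:
        return min(a, b)
    if s == 0:
        return a ** (1 - theta) * b ** theta
    if s < 0 and (a == 0 or b == 0):
        return 0.0  # convention for s < 0 when ab = 0
    # Assumption of this sketch: for s > 0 the formula is used even if ab = 0.
    return ((1 - theta) * a ** s + theta * b ** s) ** (1.0 / s)


def violates_s_concavity(p, s, points, thetas=(0.25, 0.5, 0.75), tol=1e-12):
    """Return True if p(mix) < M_s(p(x0), p(x1); theta) for some tested triple."""
    for x0 in points:
        for x1 in points:
            for theta in thetas:
                mix = (1 - theta) * x0 + theta * x1
                if p(mix) < generalized_mean(p(x0), p(x1), theta, s) - tol:
                    return True
    return False


def student_t_density(x, nu):
    """Student t density with nu degrees of freedom (standard library only)."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)
```

For a Student t density with $\nu = 3$, the grid check finds no violation at $s = -1/4 = -1/(\nu+1)$ but does find one at $s = -0.1$, in line with the nesting of the classes $\mathcal{P}_s$.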

Nonparametric estimation of s-concave densities has been the subject of
intense research in recent years. In particular, much attention has been paid
to estimation in the special case $s = 0$, which corresponds to all log-concave
densities on $\mathbb{R}^d$. The nonparametric maximum likelihood estimator (MLE)
of a log-concave density was studied in the univariate setting by Walther
(2002), Dümbgen and Rufibach (2009), Pal, Woodroofe and Meyer (2007);
and in the multivariate setting by Cule, Samworth and Stewart (2010); Cule and Samworth
(2010). The limiting distribution theory at fixed points when $d = 1$ was
studied in Balabdaoui, Rufibach and Wellner (2009), and rate results in
Doss and Wellner (2016); Kim and Samworth (2015). Dümbgen, Samworth and Schuhmacher
(2011) also studied stability properties of the MLE projection of any
probability measure onto the class of log-concave densities.

Compared with the well-studied log-concave densities (i.e. $s = 0$), much
remains unknown concerning estimation and inference procedures for the
larger classes $\mathcal{P}_s$, $s < 0$. One important feature of this larger class is that
the densities in $\mathcal{P}_s$ ($s < 0$) are allowed to have heavier and heavier tails
as $s \to -\infty$. In fact, t-distributions with $\nu$ degrees of freedom belong to
$\mathcal{P}_{-1/(\nu+1)}(\mathbb{R})$ (and hence also to $\mathcal{P}_s(\mathbb{R})$ for any $s < -1/(\nu+1)$). The study
of maximum likelihood estimators (MLEs in the following) for general
s-concave densities in Seregin and Wellner (2010) shows that the MLE exists
and is consistent for $s \in (-1,\infty)$. However, there is no known result about
uniqueness of the MLE of s-concave densities except for $s = 0$. The
difficulties in the theory of estimation via MLE lie in the fact that we still
have very little knowledge of 'good' characterizations of the MLE in the
s-concave setting. This has hindered further development of both theoretical
and statistical properties of the estimation procedure.

Some alternative approaches to estimation of s-concave densities have
been proposed in the literature by using divergences other than the
log-likelihood functional (Kullback-Leibler divergence in some sense). Koenker and Mizera
(2010) proposed an alternative to maximum likelihood based on generalized
Rényi entropies. Similar procedures were also proposed in parametric
settings by Basu et al. (1998) using a family of discrepancy measures. In our
setting of s-concave densities with $s < 0$, the methods of Koenker and Mizera
(2010) can be formulated as follows.

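Given i.i.d. observations $X = (X_1, \dots, X_n)$, consider the primal
optimization problem (P):

$$
(\mathrm{P})\qquad \min_{g \in \mathcal{G}(X)} L(g, Q_n) \equiv \frac{1}{n}\sum_{i=1}^n g(X_i) + \frac{1}{|\beta|}\int g(x)^{\beta}\,dx, \tag{1.1}
$$

where $\mathcal{G}(X)$ denotes all non-negative closed convex functions supported on
the convex set conv(X), $Q_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$ the empirical measure and
$\beta = 1 + 1/s < 0$. As is shown by Koenker and Mizera (2010), the associated dual
problem (D) is

$$
(\mathrm{D})\qquad \max_{f} \int \frac{f(y)^{\alpha}}{\alpha}\,dy, \qquad
\text{subject to } f = \frac{d(Q_n - G)}{dy} \text{ for some } G \in \mathcal{G}(X)^{\circ}, \tag{1.2}
$$

where $\mathcal{G}(X)^{\circ} = \{G \in C^*(X) \mid \int g\,dG \le 0 \text{ for all } g \in \mathcal{G}(X)\}$ is the polar cone
of $\mathcal{G}(X)$, and $\alpha$ is the conjugate index of $\beta$, i.e. $1/\alpha + 1/\beta = 1$. Here $C^*(X)$,
the space of signed Radon measures on conv(X), is the topological dual of
$C(X)$, the space of continuous functions on conv(X). We also note that the
constraint $G \in \mathcal{G}(X)^{\circ}$ in the dual form (1.2) comes from the 'dual' of the
primal constraint $g \in \mathcal{G}(X)$, and the constraint $f = d(Q_n - G)/dy$ can be derived
from the dual computation of $L(\cdot, Q_n)$: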
Here we used the notation $\langle G, g\rangle := \int g\,dG$, $\psi_s(\cdot) := (\cdot)^{\beta}/|\beta|$, and $\Psi_s$ is the
functional defined by $\Psi_s(g) := \int \psi_s(g(x))\,dx$ for clarity. Now the dual form
(1.2) follows by the well known fact (e.g. Rockafellar (1971) Corollary 4A)
that the form of the above dual functional is given by

$$
\Psi_s^*(G) =
\begin{cases}
\int \psi_s^*(dG/dx)\,dx & \text{if } G \text{ is absolutely continuous with respect to Lebesgue measure},\\
+\infty & \text{otherwise}.
\end{cases}
$$

For the primal problem (P) and the dual problem (D), Koenker and Mizera
(2010) proved the following results:


Theorem 1.1 (Theorem 4.1, Koenker and Mizera (2010)). (P) admits
a unique solution $g_n^*$ if $\mathrm{int}(\mathrm{conv}(X)) \neq \emptyset$, where $g_n^*$ is a polyhedral convex
function supported on conv(X).

Theorem 1.2 (Theorem 3.1, Koenker and Mizera (2010)). Strong
duality between (P) and (D) holds. Any dual feasible solution is actually a
density on $\mathbb{R}^d$ with respect to the canonical Lebesgue measure. The dual
optimal solution $f_n^*$ exists, and satisfies $f_n^* = (g_n^*)^{1/s}$.


We note that the above results are all obtained in the empirical setting. At
the population level, given a probability measure $Q$ with suitable regularity
conditions, consider

$$
(\mathrm{P}_Q)\qquad \min_{g \in \mathcal{G}} L_s(g, Q), \tag{1.3}
$$

where

$$
L(g, Q) = L_s(g, Q) \equiv \int g(x)\,dQ + \frac{1}{|\beta|}\int_{\mathbb{R}^d} g(x)^{\beta}\,dx,
$$

and $\mathcal{G}$ denotes the class of all (non-negative) closed convex functions with
non-empty interior, which are coercive in the sense that $g(x) \to \infty$ as $\|x\| \to \infty$.
Koenker and Mizera (2010) show that Fisher consistency holds at the
population level: Suppose $Q(A) := \int_A f_0\,d\lambda$ is defined for some $f_0 = g_0^{1/s}$
where $g_0 \in \mathcal{G}$; then $g_0$ is an optimal solution for $(\mathrm{P}_Q)$.
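To make the functional concrete, here is a small sketch that evaluates the empirical version $L_s(g, Q_n)$ in dimension one. It is an illustration only: the function name `renyi_objective`, the truncation of the integral to a finite interval, and the midpoint rule are all assumptions of this sketch, not the discretization of Koenker and Mizera (2010).

```python
def renyi_objective(g, sample, s, lower, upper, steps=200_000):
    """Approximate L_s(g, Q_n) = mean(g(X_i)) + (1/|beta|) * int g(x)^beta dx.

    The integral is truncated to [lower, upper] and computed with the
    midpoint rule; beta = 1 + 1/s is negative for -1 < s < 0.
    """
    if not -1 < s < 0:
        raise ValueError("this sketch assumes -1 < s < 0")
    beta = 1.0 + 1.0 / s
    width = (upper - lower) / steps
    integral = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        integral += g(x) ** beta
    integral *= width
    data_term = sum(g(x) for x in sample) / len(sample)
    return data_term + integral / abs(beta)
```

For example, with $s = -1/3$ (so $\beta = -2$), $g(x) = 1 + |x|$ and a single observation at 0, the exact value is $g(0) + \frac12\int (1+|x|)^{-2}\,dx = 2$; the truncated numerical value on $[-1000, 1000]$ is slightly smaller because the tails of the integral are cut off.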

Koenker and Mizera (2010) also proposed a general discretization scheme
corresponding to the primal form (1.1) and the dual form (1.2) for fast
computation, by which the one-dimensional problem can be solved via linear
programming and the two-dimensional problem via semi-definite
programming. These have been implemented in the R package REBayes by
Koenker and Mizera (2014). Koenker's package depends in turn on the MOSEK
implementation of MOSEK ApS (2011); see Appendix B of Koenker and Mizera
(2010) for further details. On the other hand, in the special case $s = 0$,
computation of the MLEs of log-concave densities has been implemented in the
R package LogConcDEAD developed in Cule, Samworth and Stewart (2010) in
arbitrary dimensions. However, the expensive search for the proper triangulation
of the support conv(X) renders computation difficult in high dimensions.

In this paper, we show that the estimation procedure proposed by Koenker and Mizera
(2010) is the 'natural' way to estimate s-concave densities. As a starting
point, since the classes $\mathcal{P}_s$ are nested in $s$, it is natural to consider
estimation of the extreme case $s = 0$ (the class of log-concave densities) as some
kind of 'limit' of estimation of the larger class $s < 0$. As we will see,
estimation of s-concave distributions via Rényi divergences is intrinsically related
to the estimation of log-concave distributions via maximum likelihood
methods. In fact we show that in the empirical setting in dimension 1, the
Rényi divergence estimators converge to the maximum likelihood estimator
for log-concave densities as $s \nearrow 0$.

We will show that the Rényi divergence estimators share characterization
and stability properties similar to the analogous properties established in the
log-concave setting by Dümbgen and Rufibach (2009); Cule and Samworth
(2010) and Dümbgen, Samworth and Schuhmacher (2011). Once these
properties are available, further theoretical and statistical considerations in
estimation of s-concave densities become possible. In particular, the
characterizations developed here enable us to overcome some of the difficulties of
maximum likelihood estimators as proposed by Seregin and Wellner (2010),
and to develop limit distribution theory at fixed points assuming that the
underlying model is s-concave. The pointwise rate and limit distribution
results follow a pattern similar to the corresponding results for the MLEs
in the log-concave setting obtained by Balabdaoui, Rufibach and Wellner
(2009). This local point of view also underlines the results on global rates of
convergence considered in Doss and Wellner (2016), showing that the
difficulty of estimation for such densities, with tails light or heavy, comes almost
solely from the shape constraints, namely, the convexity-based constraints.

The rest of the paper is organized as follows. In Section 2, we study the
basic theoretical properties of the approximation/projection scheme defined
by the procedure (1.3). In Section 3, we study the limit behavior of s-concave
probability measures in the setting of weak convergence under
dimensionality conditions on the supports of the limiting sequence. In Section 4, we
develop limiting distribution theory of the divergence estimator in
dimension 1 under curvature conditions with tools developed in Sections 2 and 3.
Related issues and further problems are discussed in Section 5. Proofs are
given in Sections 6 and 7.

1.2. Notation. In this paper, we denote the canonical Lebesgue measure
on $\mathbb{R}^d$ by $\lambda$ or $\lambda_d$ and write $\|\cdot\|_p$ for the canonical Euclidean p-norm in $\mathbb{R}^d$,
and $\|\cdot\| = \|\cdot\|_2$ unless otherwise specified. $B(x, \delta)$ stands for the open ball of
radius $\delta$ centered at $x$ in $\mathbb{R}^d$, and $1_A$ for the indicator function of $A \subset \mathbb{R}^d$.
We use $L_p(f) \equiv \|f\|_{L_p} \equiv \|f\|_p = (\int |f|^p\,d\lambda_d)^{1/p}$ to denote the $L_p(\lambda_d)$ norm
of a measurable function $f$ on $\mathbb{R}^d$ if no confusion arises.

We write csupp(Q) for the convex support of a measure $Q$ defined on $\mathbb{R}^d$,
i.e.

$$
\mathrm{csupp}(Q) = \bigcap \{C : C \subset \mathbb{R}^d \text{ closed and convex},\ Q(C) = 1\}.
$$

We let $\mathcal{Q}_0$ denote all probability measures on $\mathbb{R}^d$ whose convex support has
non-void interior, while $\mathcal{Q}_1$ denotes the set of all probability measures $Q$
with finite first moment: $\int \|x\|\,Q(dx) < \infty$.

We write $f_n \to_d f$ if $P_n$ converges weakly to $P$ for the corresponding
probability measures $P_n(A) \equiv \int_A f_n\,d\lambda$ and $P(A) \equiv \int_A f\,d\lambda$.

We write $\alpha := 1 + s$, $\beta := 1 + 1/s$, $r := -1/s$ unless otherwise specified.
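As a quick consistency check on these index conventions (a worked identity added for the reader, using only the definitions of $\alpha$, $\beta$ and $r$ just given), note that for $-1 < s < 0$:

```latex
\[
\frac{1}{\alpha} + \frac{1}{\beta}
  = \frac{1}{1+s} + \frac{s}{1+s} = 1,
\qquad
\beta - 1 = \frac{1}{s} = -r,
\qquad
0 < \alpha < 1, \quad \beta < 0 .
\]
```

In particular $g^{\beta - 1} = g^{1/s}$, which is why differentiating the penalty term $\int g^{\beta}\,dx$ produces the density $f = g^{1/s}$.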

2. Theoretical properties of the divergence estimator. In this
section, we study the basic theoretical properties of the proposed
projection scheme via Rényi divergence (1.3). Starting from a given probability
measure $Q$, we first show the existence and uniqueness of such projections
via Rényi divergence under assumptions on the index $s$ and $Q$. We will call
such a projection the Rényi divergence estimator for the given probability
measure $Q$ in the following discussions. We next show that the projection
scheme is continuous in $Q$ in the following sense: if a sequence of probability
measures $Q_n$, for which the projections onto the class of s-concave densities
exist, converges to a limiting probability measure $Q$ in Wasserstein distance,
then the corresponding projected densities converge in weighted $L_1$ metrics
and uniformly on closed subsets of the continuity set of the limit. The
directional derivatives of such projected densities also converge uniformly in all
directions in a local sense. We then turn our attention to the explicit
characterizations of the Rényi divergence estimators, especially in dimension 1. This
helps in two ways. First, it helps to understand the continuity of the
projection scheme in the index $s$, i.e. answers affirmatively the question: For a given
probability measure $Q$, does the Rényi divergence estimator converge to the
log-concave projection as studied in Dümbgen, Samworth and Schuhmacher
(2011) as $s \nearrow 0$? Second, the explicit characterizations are exploited in the
development of asymptotic distribution theory presented in Section 4.
2.1. Existence and uniqueness. For a given probability measure $Q$, let
$L(Q) = \inf_{g \in \mathcal{G}} L(g, Q)$.

Lemma 2.1. Assume $-1/(d+1) < s < 0$ and $Q \in \mathcal{Q}_0$. Then $L(Q) < \infty$
if and only if $Q \in \mathcal{Q}_1$.

Now we state our main theorem for the existence of the Rényi divergence
projection corresponding to a general measure $Q$ on $\mathbb{R}^d$.

Theorem 2.2. Assume $-1/(d+1) < s < 0$ and $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$. Then (1.3)
achieves its nontrivial minimum for some $g \in \mathcal{G}$. Moreover, $g$ is bounded
away from zero, and $f \equiv g^{1/s}$ is a bounded density with respect to $\lambda_d$.

The uniqueness of the solution follows immediately from the strict
convexity of the functional $L(\cdot, Q)$.

Lemma 2.3. $g$ is the unique solution for $(\mathrm{P}_Q)$ if $\mathrm{int}(\mathrm{dom}(g)) \neq \emptyset$.

Remark 2.4. By the above discussion, we conclude that the map
$Q \mapsto \arg\min_{g \in \mathcal{G}} L(g, Q)$ is well-defined for probability measures $Q$ with suitable
regularity conditions: in particular, if $Q \in \mathcal{Q}_0$ and $-1/(d+1) < s < 0$, it
is well-defined if and only if $Q \in \mathcal{Q}_1$. From now on we denote the optimal
solution as $g_s(\cdot|Q)$ or simply $g(\cdot|Q)$ if no confusion arises, and write $P_Q$
for the corresponding s-concave distribution, and say that $P_Q$ is the Rényi
projection of $Q$ to $P_Q \in \mathcal{P}_s$.

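2.2. Weighted global convergence in $\|\cdot\|_{L_1}$ and $\|\cdot\|_{\infty}$.

Theorem 2.5. Assume $-1/(d+1) < s < 0$. Let $\{Q_n\} \subset \mathcal{Q}_0$ be a
sequence of probability measures converging weakly to $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$. Then

$$
\int \|x\|\,dQ \le \liminf_{n\to\infty} \int \|x\|\,dQ_n. \tag{2.1}
$$

If we further assume that

$$
\lim_{n\to\infty} \int \|x\|\,dQ_n = \int \|x\|\,dQ, \tag{2.2}
$$

then

$$
L(Q) = \lim_{n\to\infty} L(Q_n). \tag{2.3}
$$

Conversely, if (2.3) holds, then (2.2) holds true. In the former case (i.e. (2.2)
holds), let $g := g(\cdot|Q)$ and $g_n := g(\cdot|Q_n)$; then $f := g^{1/s}$, $f_n := g_n^{1/s}$ satisfy

$$
\begin{aligned}
\lim_{n\to\infty,\,x\to y} f_n(x) &= f(y), && \text{for all } y \in \mathbb{R}^d \setminus \partial\{f > 0\},\\
\limsup_{n\to\infty,\,x\to y} f_n(x) &\le f(y), && \text{for all } y \in \mathbb{R}^d.
\end{aligned} \tag{2.4}
$$

For $\kappa < r - d \equiv -1/s - d$,

$$
\lim_{n\to\infty} \int (1 + \|x\|)^{\kappa}\,|f_n(x) - f(x)|\,dx = 0. \tag{2.5}
$$

For any closed set $S$ contained in the continuity points of $f$ and $\kappa < r$,

$$
\lim_{n\to\infty} \sup_{x \in S} (1 + \|x\|)^{\kappa}\,|f_n(x) - f(x)| = 0. \tag{2.6}
$$

Furthermore, let $D_f := \{x \in \mathrm{int}(\mathrm{dom}(f)) : f \text{ is differentiable at } x\}$, and let
$T \subset \mathrm{int}(D_f)$ be any compact set. Then

$$
\lim_{n\to\infty} \sup_{x \in T,\ \|\xi\|_2 = 1} |\nabla_{\xi} f_n(x) - \nabla_{\xi} f(x)| = 0, \tag{2.7}
$$

where $\nabla_{\xi} f(x) := \lim_{h \searrow 0} \frac{f(x + h\xi) - f(x)}{h}$ denotes the (one-sided) directional
derivative along $\xi$.

Remark 2.6. The one-sided directional derivative for a convex function
$g$ is well-defined and $\nabla_{\xi} g(x) = \inf_{h > 0} \frac{g(x + h\xi) - g(x)}{h}$, hence well-defined for
$f \equiv g^{1/s}$. See Section 23 in Rockafellar (1997) for more details.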

As a direct consequence, we have the following result covering both
on-the-model and off-the-model cases.

Corollary 2.7. Assume $-1/(d+1) < s < 0$. Let $Q$ be a
probability measure such that $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$, with $f_Q := g(\cdot|Q)^{1/s}$ the density
function corresponding to the Rényi projection $P_Q$ (as in Remark 2.4). Let
$Q_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$ be the empirical measure when $X_1, \dots, X_n$ are i.i.d. with
distribution $Q$ on $\mathbb{R}^d$. Let $\hat g_n := g(\cdot|Q_n)$ and $\hat f_n := \hat g_n^{1/s}$ be the Rényi
divergence estimator of $Q$. Then, almost surely we have

$$
\begin{aligned}
\lim_{n\to\infty,\,x\to y} \hat f_n(x) &= f_Q(y), && \text{for all } y \in \mathbb{R}^d \setminus \partial\{f_Q > 0\},\\
\limsup_{n\to\infty,\,x\to y} \hat f_n(x) &\le f_Q(y), && \text{for all } y \in \mathbb{R}^d.
\end{aligned} \tag{2.8}
$$

For $\kappa < r - d \equiv -1/s - d$,

$$
\lim_{n\to\infty} \int (1 + \|x\|)^{\kappa}\,|\hat f_n(x) - f_Q(x)|\,dx =_{a.s.} 0. \tag{2.9}
$$

For any closed set $S$ contained in the continuity points of $f_Q$ and $\kappa < r$,

$$
\lim_{n\to\infty} \sup_{x \in S} (1 + \|x\|)^{\kappa}\,|\hat f_n(x) - f_Q(x)| =_{a.s.} 0. \tag{2.10}
$$

Furthermore, for any compact set $T \subset \mathrm{int}(D_{f_Q})$,

$$
\lim_{n\to\infty} \sup_{x \in T,\ \|\xi\|_2 = 1} |\nabla_{\xi} \hat f_n(x) - \nabla_{\xi} f_Q(x)| =_{a.s.} 0. \tag{2.11}
$$

Now we return to the correctly specified case and relax the previous
assumption that $s > -1/(d+1)$ for the case of the empirical measure $Q_n = \mathbb{Q}_n$
and some measure $Q$ with finite mean and bounded density $f \in \mathcal{P}_{s'} \subset \mathcal{P}_s$
with $s' > s$.

Corollary 2.8. Assume $-1/d < s < 0$. Let $Q$ be a probability measure
on $\mathbb{R}^d$ with density $f \in \mathcal{P}_s$ if $-1/(d+1) < s$, and $f \in \mathcal{P}_{s'}$ where
$s' > -1/(d+1)$ if $s \in (-1/d, -1/(d+1)]$. (Thus $f$ is bounded and $f$ has a
finite mean.) Let $\hat f_n = \hat f_{n,s}$ be defined as in Corollary 2.7. Then (2.8), (2.9),
(2.10), and (2.11) hold with $f_Q$ replaced by $f$.

2.3. Characterization of the Rényi divergence projection and estimator.
We now develop characterizations for the Rényi divergence projection,
especially in dimension 1. All proofs for this subsection can be found in
Appendix 6.1.

We note that the assumption $-1/(d+1) < s < 0$ is imposed only for the
existence and uniqueness of the Rényi divergence projection. For the specific
case of the empirical measure $Q_n = \mathbb{Q}_n$, this condition can be relaxed to
$-1/d < s < 0$.

Now we give a variational characterization in the spirit of Theorem 2.2 in
Dümbgen and Rufibach (2009). This result holds for all dimensions $d \ge 1$.

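Theorem 2.9. Assume $-1/(d+1) < s < 0$ and $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$. Then
$g = g(\cdot|Q)$ if and only if

$$
\int h \cdot g^{1/s}\,d\lambda \le \int h\,dQ \tag{2.12}
$$

holds for all $h : \mathbb{R}^d \to \mathbb{R}$ such that there exists $t_0 > 0$ with $g + th \in \mathcal{G}$
for all $t \in (0, t_0)$.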
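The inequality (2.12) is a first-order optimality condition. The following sketch indicates where it comes from, assuming that differentiation under the integral sign is justified for the admissible direction $h$ (this interchange is an assumption of the sketch; the rigorous argument is in the proofs). Since $g$ minimizes $L(\cdot, Q)$ and $g + th \in \mathcal{G}$ for small $t > 0$, the right derivative of $t \mapsto L(g + th, Q)$ at 0 must be non-negative:

```latex
\[
0 \le \frac{d}{dt}\Big|_{t=0+} L(g + th, Q)
  = \int h\,dQ + \frac{\beta}{|\beta|}\int g^{\beta-1} h\,d\lambda
  = \int h\,dQ - \int h\, g^{1/s}\,d\lambda ,
\]
% using \beta = 1 + 1/s < 0, so that \beta/|\beta| = -1 and \beta - 1 = 1/s.
```

Rearranging gives $\int h\, g^{1/s}\,d\lambda \le \int h\,dQ$, which is the necessity direction of the theorem; sufficiency rests on the convexity of $L(\cdot, Q)$.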










Corollary 2.10. Assume $-1/(d+1) < s < 0$ and $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$, and
let $h$ be any closed convex function. Then

$$
\int h\,dP \le \int h\,dQ,
$$

where $P = P_Q$ is the Rényi projection of $Q$ to $P_Q \in \mathcal{P}_s$ (as in Remark 2.4).

As a direct consequence, we have 

Corollary 2.11 (Moment Inequalities). Assume $-1/(d+1) < s < 0$
and $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$. Let $\mu_Q := E_Q[X]$. Then $\mu_P = \mu_Q$. Furthermore, if
$-1/(d+2) < s < 0$, we have $\lambda_{\max}(\Sigma_P) \le \lambda_{\max}(\Sigma_Q)$ and $\lambda_{\min}(\Sigma_P) \le \lambda_{\min}(\Sigma_Q)$, where
$\Sigma_Q$ is the covariance matrix defined by $\Sigma_Q := E_Q[(X - \mu_Q)(X - \mu_Q)^T]$.
Generally, if $-1/(d+k) < s < 0$ for some $k \in \mathbb{N}$, then $E_P[\|X\|^l] \le E_Q[\|X\|^l]$
holds for all $l = 1, \dots, k$.
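A short sketch of why these moment statements follow from the convex-function inequality $\int h\,dP \le \int h\,dQ$ of Corollary 2.10, applied to particular choices of $h$. The integrability of each $h$ under $P$ and $Q$ is assumed here; this is presumably where the restrictions on $s$ enter.

```latex
% Both h(x) = a^T x and h(x) = -a^T x are convex, for every a in R^d:
\[
\int a^\top x\,dP \le \int a^\top x\,dQ
\quad\text{and}\quad
-\int a^\top x\,dP \le -\int a^\top x\,dQ
\;\Longrightarrow\;
a^\top \mu_P = a^\top \mu_Q \quad \text{for all } a .
\]
% h(x) = (a^T (x - \mu_Q))^2 is convex, and \mu_P = \mu_Q:
\[
a^\top \Sigma_P\, a = \int \big(a^\top (x - \mu_Q)\big)^2\,dP
 \le \int \big(a^\top (x - \mu_Q)\big)^2\,dQ = a^\top \Sigma_Q\, a .
\]
```

Taking the maximum and the minimum of both sides over unit vectors $a$ gives the two eigenvalue inequalities, and $h(x) = \|x\|^l$ with $l \ge 1$ gives the last statement.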

Now we restrict our attention to $d = 1$, and in the following we will give
a full characterization of the Rényi divergence estimator. Suppose we
observe $X_1, \dots, X_n$ i.i.d. $Q$ on $\mathbb{R}$, and let $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$ be the
order statistics of $X_1, \dots, X_n$. Let $\mathbb{F}_n$ be the empirical distribution
function corresponding to the empirical probability measure $Q_n := \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$.
Let $\hat g_n := g(\cdot|Q_n)$ and $\hat F_n(t) := \int_{-\infty}^{t} \hat g_n^{1/s}(x)\,dx$. From Theorem 4.1 in
Koenker and Mizera (2010) it follows that $\hat g_n$ is a convex function supported
on $[X_{(1)}, X_{(n)}]$, and linear on $[X_{(i)}, X_{(i+1)}]$ for all $i = 1, \dots, n-1$. For a
continuous piecewise linear function $h : [X_{(1)}, X_{(n)}] \to \mathbb{R}$, define the set of knots
to be

$$
S_n(h) := \{t \in (X_{(1)}, X_{(n)}) : h'(t-) \neq h'(t+)\} \cap \{X_1, \dots, X_n\}.
$$


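Theorem 2.12. Let $g_n$ be a convex function taking the value $+\infty$ on
$\mathbb{R} \setminus [X_{(1)}, X_{(n)}]$ and linear on $[X_{(i)}, X_{(i+1)}]$ for all $i = 1, \dots, n-1$. Let

$$
F_n(t) := \int_{-\infty}^{t} g_n^{1/s}(x)\,dx.
$$

Assume $F_n(X_{(n)}) = 1$. Then $g_n = \hat g_n$ if and only if

$$
\int_{X_{(1)}}^{t} \big(F_n(x) - \mathbb{F}_n(x)\big)\,dx
\begin{cases}
= 0 & \text{if } t \in S_n(g_n),\\
\le 0 & \text{otherwise}.
\end{cases} \tag{2.13}
$$

Corollary 2.13. For $x_0 \in S_n(\hat g_n)$, we have

$$
\mathbb{F}_n(x_0) - \frac{1}{n} \le \hat F_n(x_0) \le \mathbb{F}_n(x_0).
$$

Finally we give a characterization of the Rényi divergence estimator in
terms of distribution functions, as in Theorem 2.7 of Dümbgen, Samworth and Schuhmacher
(2011).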

Theorem 2.14. Assume $-1/2 < s < 0$ and $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$ is a probability
measure on $\mathbb{R}$ with distribution function $G(\cdot)$. Let $g \in \mathcal{G}$ be such that $f \equiv g^{1/s}$
is a density on $\mathbb{R}$, with distribution function $F(\cdot)$. Then $g = g(\cdot|Q)$ if and
only if

1. $\int_{\mathbb{R}} (F - G)(t)\,dt = 0$;

2. $\int_{-\infty}^{x} (F - G)(t)\,dt \le 0$ for all $x \in \mathbb{R}$, with equality when $x \in S(g)$.

Here $S(g) := \{x \in \mathbb{R} : g(x) < \frac{1}{2}(g(x + \delta) + g(x - \delta)) \text{ holds for } \delta > 0 \text{ small enough}\}$.

The above theorem is useful for understanding the projected s-concave
density given an arbitrary probability measure $Q \in \mathcal{Q}_0 \cap \mathcal{Q}_1$. The following
example illustrates these projections and also gives some insight concerning
the boundary properties of the class of s-concave densities.


Example 2.15. Consider the class of densities $\mathcal{Q}$ defined by

$$
\mathcal{Q} = \left\{ q_{\tau}(x) = \frac{\tau - 1}{2(\tau - 2)} \left(1 + \frac{|x|}{\tau - 2}\right)^{-\tau},\ \tau > 2 \right\}.
$$

Note that $q_{\tau}$ is $-1/\tau$-concave and not s-concave for any $0 > s > -1/\tau$. We
start from an arbitrary $q_{\tau} \in \mathcal{Q}$ with $\tau > 2$, and we will show in the following that
the projection of $q_{\tau}$ onto the class of s-concave ($0 > s > -1/\tau$) distributions
through $L(\cdot, q_{\tau})$ is given by $q_{-1/s}$. Let $Q_{\tau}$ be the distribution function
of $q_{\tau}(\cdot)$; then we can calculate

$$
Q_{\tau}(x) =
\begin{cases}
\dfrac{1}{2}\left(1 - \dfrac{x}{\tau - 2}\right)^{-(\tau - 1)}, & x \le 0,\\[2ex]
1 - \dfrac{1}{2}\left(1 + \dfrac{x}{\tau - 2}\right)^{-(\tau - 1)}, & x > 0.
\end{cases}
$$

It is easy to check by direct calculation that, with $r = -1/s$,
$\int_{-\infty}^{x} (Q_r(t) - Q_{\tau}(t))\,dt \le 0$
with equality attained if and only if $x = 0$. It is clear that $S(q_r) = \{0\}$ and
hence the conditions in Theorem 2.14 are verified. Note that, in Example 2.9
of Dümbgen, Samworth and Schuhmacher (2011), the log-concave
approximation of the rescaled $t_2$ density is the Laplace distribution. It is easy to
see from the above calculation that the log-concave projection of the whole
class $\mathcal{Q}$ will be the Laplace distribution $q_{\infty} = \frac{1}{2}\exp(-|x|)$. Therefore the
log-concave approximation fails to distinguish densities at least amongst the
class $\mathcal{Q} \cup \{t_2\}$.
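The "direct calculation" in this example can be checked numerically. The sketch below is an illustration with hypothetical helper names (`q_density`, `q_cdf`, `integrated_cdf_gap`); the closed form used in `integrated_cdf_gap` is our own integration of the distribution function of $q_\tau$, namely $\int_{-\infty}^{-u} Q_m(t)\,dt = \frac12 (1 + u/(m-2))^{-(m-2)}$ for $u \ge 0$, combined with the symmetry of the densities.

```python
def q_density(x, tau):
    """Density q_tau(x) = (tau-1)/(2(tau-2)) * (1 + |x|/(tau-2))^(-tau), tau > 2."""
    return (tau - 1.0) / (2.0 * (tau - 2.0)) * (1.0 + abs(x) / (tau - 2.0)) ** (-tau)


def q_cdf(x, tau):
    """Distribution function Q_tau of q_tau."""
    tail = 0.5 * (1.0 + abs(x) / (tau - 2.0)) ** (-(tau - 1.0))
    return tail if x <= 0 else 1.0 - tail


def integrated_cdf_gap(x, r, tau):
    """Closed form of int_{-inf}^x (Q_r(t) - Q_tau(t)) dt.

    For u >= 0, int_{-inf}^{-u} Q_m(t) dt = 0.5 * (1 + u/(m-2))^(-(m-2)).
    By symmetry of q_m the gap depends on x only through |x|.
    """
    def half_tail(u, m):
        return 0.5 * (1.0 + u / (m - 2.0)) ** (-(m - 2.0))

    return half_tail(abs(x), r) - half_tail(abs(x), tau)
```

With, say, $\tau = 3$ and $r = -1/s = 6$, the gap vanishes at $x = 0$ and is strictly negative elsewhere, consistent with the two conditions of Theorem 2.14. Letting $r \to \infty$ in the closed form gives $\frac12 e^{-|x|}$, the corresponding quantity for the Laplace density.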


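2.4. Continuity of the Rényi divergence estimator in s. Recall that
$\alpha = 1 + s$, and then $\alpha, \beta$ is a conjugate pair with $\alpha^{-1} + \beta^{-1} = 1$ where $\beta = 1 + 1/s$.

For $1 - 1/d < \alpha < 1$, let

$$
F_{\alpha}(f) = \frac{1}{\alpha - 1} \log \int f^{\alpha}(x)\,dx,
\qquad
F_1(f) = \int f(x) \log f(x)\,dx.
$$

For a given index $-1/d < s < 0$, and data $X = (X_1, \dots, X_n)$ with non-void
int(conv(X)), solving the dual problem (1.2) for the primal problem (1.1) is
equivalent to solving

$$
(\mathrm{D}_{\alpha})\qquad \min_{f} F_{\alpha}(f) = \frac{1}{\alpha - 1} \log \int f^{\alpha}(x)\,dx,
\qquad \text{subject to } f = \frac{d(Q_n - G)}{dy} \text{ for some } G \in \mathcal{G}(X)^{\circ}, \tag{2.14}
$$

where $\mathcal{G}(X)^{\circ}$ is the polar cone of $\mathcal{G}(X)$ and $Q_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$ is the empirical
measure. The maximum likelihood estimation of a log-concave density has
dual form

$$
(\mathrm{D}_1)\qquad \min_{f} F_1(f) = \int f(x) \log f(x)\,dx,
\qquad \text{subject to } f = \frac{d(Q_n - G)}{dy} \text{ for some } G \in \mathcal{G}(X)^{\circ}. \tag{2.15}
$$

Let $f_{\alpha}$ and $f_1$ be the solutions of $(\mathrm{D}_{\alpha})$ and $(\mathrm{D}_1)$. For simplicity we drop the
explicit notational dependence of $f_{\alpha}, f_1$ on $n$. Since $F_{\alpha}(f) \to F_1(f)$ as $\alpha \nearrow 1$
for $f$ smooth enough, it is natural to expect some convergence property of
$f_{\alpha}$ to $f_1$. The main result is summarized as follows.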
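The convergence $F_{\alpha}(f) \to F_1(f)$ as $\alpha \nearrow 1$ can be seen numerically for a fixed density. The sketch below is only an illustration: the function names and the midpoint-rule integration on a truncated interval are assumptions of this sketch, and the Laplace density is chosen because both functionals have closed forms there, $F_{\alpha} = -\log 2 - \log(\alpha)/(\alpha - 1)$ and $F_1 = -\log 2 - 1$.

```python
import math


def integrate(fn, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of fn over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(fn(lower + (i + 0.5) * width) for i in range(steps))


def renyi_functional(f, alpha, lower, upper):
    """F_alpha(f) = log(int f^alpha) / (alpha - 1), for alpha != 1."""
    return math.log(integrate(lambda x: f(x) ** alpha, lower, upper)) / (alpha - 1.0)


def entropy_functional(f, lower, upper):
    """F_1(f) = int f log f (f must be positive on [lower, upper])."""
    return integrate(lambda x: f(x) * math.log(f(x)), lower, upper)


def laplace(x):
    """Laplace density (1/2) exp(-|x|)."""
    return 0.5 * math.exp(-abs(x))
```

Evaluating `renyi_functional(laplace, a, -60.0, 60.0)` for `a` in `(0.9, 0.99, 0.999)` shows the values approaching `entropy_functional(laplace, -60.0, 60.0)`. Note that this only illustrates convergence of the objective at a fixed $f$; convergence of the minimizers $f_{\alpha}$ is the content of the theorem that follows.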


Theorem 2.16. Suppose $d = 1$. For all $\kappa > 0$, $p \ge 1$, we have the
following weighted convergence

$$
\lim_{\alpha \nearrow 1} \int (1 + \|x\|)^{\kappa}\,|f_{\alpha}(x) - f_1(x)|^p\,dx = 0.
$$

Moreover, for any closed set $S$ contained in the continuity points of $f_1$,

$$
\lim_{\alpha \nearrow 1} \sup_{x \in S} (1 + \|x\|)^{\kappa}\,|f_{\alpha}(x) - f_1(x)| = 0
$$

for all $\kappa > 0$.

3. Limit behavior of s-concave densities. Let $\{f_n\}_{n \in \mathbb{N}}$ be a
sequence of s-concave densities with corresponding measures $d\nu_n = f_n\,d\lambda$.
Suppose $\nu_n \to_d \nu$. From Borell (1974, 1975) and Brascamp and Lieb (1976),
we know that each $\nu_n$ is a t-concave measure with $t = s/(1 + sd)$ if
$-1/d < s < \infty$, $t = -\infty$ if $s = -1/d$, and $t = 1/d$ if $s = \infty$. This
result is proved via different methods by Rinott (1976). Furthermore, if the
dimension of the support of $\nu$ is $d$, then it follows from Borell (1974),
Theorem 2.2 that the limit measure $\nu$ is t-concave and hence has a Lebesgue
density with $s = t/(1 - td)$. Here we pursue this type of result in somewhat
more detail. Our key dimensionality condition will be formulated in terms
of the set $C := \{x \in \mathbb{R}^d : \liminf_n f_n(x) > 0\}$. We will show that if

(D1) Either $\dim(\mathrm{csupp}(\nu)) = d$ or $\dim(C) = d$

holds, then the limiting probability measure $\nu$ admits an upper semi-continuous
s-concave density on $\mathbb{R}^d$. Furthermore, if a sequence of s-concave densities
$\{f_n\}$ converges weakly to some density $f$ (in the sense that the corresponding
probability measures converge weakly), then $f$ is s-concave, and $f_n$ converges
to $f$ in weighted $L_1$ metrics and uniformly on any closed set of continuity
points of $f$. The directional derivatives of $f_n$ also converge uniformly in all
directions in a local sense.
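The correspondence between the concavity index $s$ of a density on $\mathbb{R}^d$ and the index $t$ of the associated measure, $t = s/(1 + sd)$ with inverse $s = t/(1 - td)$, is easy to tabulate. The following sketch uses hypothetical function names and plain floating-point arithmetic; the boundary cases follow the conventions stated above.

```python
import math


def measure_index(s, d):
    """Index t of the measure whose density on R^d is s-concave (s >= -1/d)."""
    if s < -1.0 / d:
        raise ValueError("requires s >= -1/d")
    if s == -1.0 / d:
        return -math.inf
    if s == math.inf:
        return 1.0 / d
    return s / (1.0 + s * d)


def density_index(t, d):
    """Inverse map: index s of the density of a t-concave measure (t <= 1/d)."""
    if t > 1.0 / d:
        raise ValueError("requires t <= 1/d")
    if t == -math.inf:
        return -1.0 / d
    if t == 1.0 / d:
        return math.inf
    return t / (1.0 - t * d)
```

For instance, in dimension $d = 3$ a density with $s = -0.2$ corresponds to $t = -0.5$, and the log-concave case $s = 0$ is mapped to $t = 0$ in every dimension.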

In the following sections, we will not fully exploit the strength of the 
results we have obtained. The results obtained will be interesting in their 
own right, and careful readers will find them useful as technical support for 
Sections 2 and 4. 

3.1. Limit characterization via dimensionality condition. Note that $C$
is a convex set. For a general convex set $K$, we follow the convention (see
Rockafellar (1997)) that $\dim K = \dim(\mathrm{aff}(K))$, where $\mathrm{aff}(K)$ is the affine
hull of $K$. It is well known that the dimension of a convex set $K$ is the
maximum of the dimensions of the various simplices included in $K$ (cf. Theorem
2.4, Rockafellar (1997)).

We first extend several results in Kim and Samworth (2015) and Cule and Samworth
(2010) from the log-concave setting to our s-concave setting. The proofs will
all be deferred to Appendix 6.2.

Lemma 3.1. Assume (D1). Then $\mathrm{csupp}(\nu) = C$.

Lemma 3.2. Let $\{\nu_n\}_{n \in \mathbb{N}}$ be probability measures with upper semi-continuous
s-concave densities $\{f_n\}_{n \in \mathbb{N}}$ such that $\nu_n \to \nu$ weakly as $n \to \infty$. Here $\nu$ is
a probability measure with density $f$. Then $f_n \to_{a.e.} f$, and $f$ can be taken
as $f = \mathrm{cl}(\lim_n f_n)$ and hence upper semi-continuous s-concave.

In many situations, uniform boundedness of a sequence of s-concave
densities gives rise to good stability and convergence properties.

Lemma 3.3. Assume $-1/d < s < 0$. Let $\{f_n\}_{n \in \mathbb{N}}$ be a sequence of
s-concave densities on $\mathbb{R}^d$. If $\dim C = d$ where $C = \{\liminf_n f_n > 0\}$ as
above, then $\sup_{n \in \mathbb{N}} \|f_n\|_{\infty} < \infty$.

Now we state one limit characterization theorem.

Theorem 3.4. Assume $-1/d < s < 0$. Under either condition of (D1),
$\nu$ is absolutely continuous with respect to $\lambda_d$, with a version of the
Radon-Nikodym derivative $\mathrm{cl}(\lim_n f_n)$, which is an upper semi-continuous and
s-concave density on $\mathbb{R}^d$.

3.2. Modes of convergence. It is shown above that the weak
convergence of s-concave probability measures implies almost everywhere
pointwise convergence at the density level. In many applications, we wish for
different or stronger types of convergence. This subsection is devoted to the study
of the following two types of convergence:

1. Convergence in $\|\cdot\|_{L_1}$ metric;

2. Convergence in $\|\cdot\|_{\infty}$ metric.

We start by investigating convergence properties in the $\|\cdot\|_{L_1}$ metric.

Lemma 3.5. Assume $-1/d < s < 0$. Let $\nu, \nu_1, \dots, \nu_n, \dots$ be probability
measures with upper semi-continuous s-concave densities $f, f_1, \dots, f_n, \dots$
such that $\nu_n \to \nu$ weakly as $n \to \infty$. Then there exist $a, b > 0$ such that
$f_n(x) \vee f(x) \le (a\|x\| + b)^{1/s}$.

Once the existence of a suitable integrable envelope function is
established, we conclude naturally by the dominated convergence theorem that

Theorem 3.6. Assume $-1/d < s < 0$. Let $\nu, \nu_1, \dots, \nu_n, \dots$ be probability
measures with upper semi-continuous s-concave densities $f, f_1, \dots, f_n, \dots$
such that $\nu_n \to \nu$ weakly as $n \to \infty$. Then for $\kappa < r - d$,

$$
\lim_{n\to\infty} \int (1 + \|x\|)^{\kappa}\,|f_n(x) - f(x)|\,dx = 0.
$$
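The threshold $\kappa < r - d$ is exactly the range in which the envelope $(a\|x\| + b)^{1/s}$ of Lemma 3.5 remains integrable against the weight $(1 + \|x\|)^{\kappa}$. A short polar-coordinates computation makes this visible (here $c_d$ denotes the surface area of the unit sphere in $\mathbb{R}^d$, notation introduced only for this sketch):

```latex
\[
\int_{\mathbb{R}^d} (1 + \|x\|)^{\kappa} (a\|x\| + b)^{1/s}\,dx
 = c_d \int_0^{\infty} (1 + \rho)^{\kappa} (a\rho + b)^{-r} \rho^{d-1}\,d\rho ,
\qquad r = -1/s .
\]
% The integrand is bounded near \rho = 0 (since b > 0) and behaves like
% \rho^{\kappa - r + d - 1} as \rho \to \infty, so the integral is finite
% if and only if \kappa - r + d - 1 < -1, i.e. \kappa < r - d.
```

With this integrable majorant and the almost everywhere convergence from Lemma 3.2, dominated convergence applies to $(1 + \|x\|)^{\kappa} |f_n(x) - f(x)|$.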
Next we examine convergence of s-concave densities in $\|\cdot\|_{\infty}$ norm. We
denote $g = f^s$, $g_n = f_n^s$ unless otherwise specified. Since we have established
pointwise convergence in Lemma 3.2, classical convex analysis guarantees
that the convergence is uniform over compact sets in $\mathrm{int}(\mathrm{dom}(f))$. To
establish a global uniform convergence result, we only need to control the tail
behavior of the class of s-concave functions and the region near the boundary
of $f$. This is accomplished via Lemmas 6.2 and 6.3.

Theorem 3.7. Let $\nu, \nu_1, \dots, \nu_n, \dots$ be probability measures with upper
semi-continuous s-concave densities $f, f_1, \dots, f_n, \dots$ such that $\nu_n \to \nu$ weakly
as $n \to \infty$. Then for any closed set $S$ contained in the continuity points of
$f$ and $\kappa < r = -1/s$,

$$
\lim_{n\to\infty} \sup_{x \in S} (1 + \|x\|)^{\kappa}\,|f_n(x) - f(x)| = 0.
$$

We note that no assumption on the index $s$ is required here.

3.3. Local convergence of directional derivatives. It is known in convex
analysis that if a sequence of convex functions $g_n$ converges pointwise to
$g$ on an open convex set, then the subdifferential of $g_n$ also 'converges' to
the subdifferential of $g$. If we further assume smoothness of $g_n$, then local
uniform convergence of the derivatives automatically follows. See Theorems
24.5 and 25.7 in Rockafellar (1997) for precise statements. Here we pursue
this issue at the level of transformed densities.

Theorem 3.8. Let $\nu, \nu_1, \dots, \nu_n, \dots$ be probability measures with upper
semi-continuous s-concave densities $f, f_1, \dots, f_n, \dots$ such that $\nu_n \to \nu$ weakly
as $n \to \infty$. Let $D_f := \{x \in \mathrm{int}(\mathrm{dom}(f)) : f \text{ is differentiable at } x\}$, and let
$T \subset \mathrm{int}(D_f)$ be any compact set. Then

$$
\lim_{n\to\infty} \sup_{x \in T,\ \|\xi\|_2 = 1} |\nabla_{\xi} f_n(x) - \nabla_{\xi} f(x)| = 0.
$$

4. Limiting distribution theory of the divergence estimator. In
this section we establish local asymptotic distribution theory of the
divergence estimator $\hat f_n$ at a fixed point $x_0 \in \mathbb{R}$. Limit distribution theory in
shape-constrained estimation was pioneered for monotone density and
regression estimators by Prakasa Rao (1969), Brunk (1970), Wright (1981)
and Groeneboom (1985). Groeneboom, Jongbloed and Wellner (2001)
established pointwise limit theory for the MLEs and LSEs of a convex
decreasing density, and also treated pointwise limit theory for estimation of a
convex regression function. Balabdaoui, Rufibach and Wellner (2009)
established pointwise limit theorems for the MLEs of log-concave densities on
$\mathbb{R}$. On the other hand, for nonparametric estimation of s-concave
densities, asymptotic theory beyond the Hellinger consistency results for the
MLEs established by Seregin and Wellner (2010) has been non-existent.
Doss and Wellner (2016) have shown in the case of $d = 1$ that the MLEs
have Hellinger convergence rates of order $O_p(n^{-2/5})$ for each $s \in (-1, \infty)$
(which includes the log-concave case $s = 0$). However, due at least in part
to the lack of explicit characterizations of the MLE for s-concave classes,
no results concerning limiting distributions of the MLE at fixed points are
currently available. In the remainder of this section we formulate results of
this type for the Rényi divergence estimators. These results are
comparable to the pointwise limit distribution results for the MLEs of log-concave
densities obtained by Balabdaoui, Rufibach and Wellner (2009).

In the following, we will see how the natural and strong characterizations
developed in Section 2 help us to understand the limit behavior of the Rényi
divergence estimator at a fixed point. For this purpose, we assume the true
density $f_0 = g_0^{1/s}$ satisfies the following:

(A1). $g_0 \in \mathcal{G}$ and $f_0$ is an s-concave density on $\mathbb{R}$, where $-1 < s < 0$;

(A2). $f_0(x_0) > 0$;

(A3). $g_0$ is locally $C^k$ around $x_0$ for some $k \ge 2$.

(A4). Let $k := \max\{k \in \mathbb{N} : k \ge 2,\ g_0^{(j)}(x_0) = 0 \text{ for all } 2 \le j \le k - 1,\ g_0^{(k)}(x_0) \neq 0\}$,
and $k = 2$ if the above set is empty. Assume $g_0^{(k)}$ is
continuous around $x_0$.

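4.1. Limit distribution theory. Before we state the main results
concerning the limit distribution theory for the Rényi divergence estimator,
let us sketch the route by which the theory is developed. We first denote
$\hat F_n(x) := \int_{-\infty}^{x} \hat f_n(t)\,dt$, $\hat H_n(x) := \int_{-\infty}^{x} \hat F_n(t)\,dt$ and $\mathbb{H}_n(x) := \int_{-\infty}^{x} \mathbb{F}_n(t)\,dt$.
We also denote $r_n := n^{(k+2)/(2k+1)}$, $s_n := n^{-1/(2k+1)}$ and $l_{n,x_0} := [x_0, x_0 + s_n t]$. Due to
the form of the characterizations obtained in Theorem 2.12, we define local
processes at the level of integrated distribution functions as follows:

$$
\begin{aligned}
\mathbb{Y}_n^{\mathrm{loc}}(t) &:= r_n \int_{l_{n,x_0}} \left( \mathbb{F}_n(v) - \mathbb{F}_n(x_0) - \int_{x_0}^{v} \Big( \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j \Big)\,du \right) dv;\\
\mathbb{H}_n^{\mathrm{loc}}(t) &:= r_n \int_{l_{n,x_0}} \left( \hat F_n(v) - \hat F_n(x_0) - \int_{x_0}^{v} \Big( \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j \Big)\,du \right) dv + \hat A_n t + \hat B_n,
\end{aligned}
$$

where $\hat A_n := n^{(k+1)/(2k+1)} (\hat F_n(x_0) - \mathbb{F}_n(x_0))$ and $\hat B_n := n^{(k+2)/(2k+1)} (\hat H_n(x_0) - \mathbb{H}_n(x_0))$ are
defined so that $\mathbb{Y}_n^{\mathrm{loc}}(\cdot) \ge \mathbb{H}_n^{\mathrm{loc}}(\cdot)$ by virtue of Theorem 2.12. Since we wish
to derive asymptotic theory at the level of the underlying convex function,
we modify the processes by

$$
\begin{aligned}
\mathbb{Y}_n^{\mathrm{locmod}}(t) &:= \frac{\mathbb{Y}_n^{\mathrm{loc}}(t)}{f_0(x_0)} - r_n \int_{l_{n,x_0}} \int_{x_0}^{v} \hat\Psi_{k,n,2}(u)\,du\,dv,\\
\mathbb{H}_n^{\mathrm{locmod}}(t) &:= \frac{\mathbb{H}_n^{\mathrm{loc}}(t)}{f_0(x_0)} - r_n \int_{l_{n,x_0}} \int_{x_0}^{v} \hat\Psi_{k,n,2}(u)\,du\,dv,
\end{aligned} \tag{4.1}
$$

where

$$
\hat\Psi_{k,n,2}(u) = \frac{1}{f_0(x_0)} \left( \hat f_n(u) - \sum_{j=0}^{k-1} \frac{f_0^{(j)}(x_0)}{j!} (u - x_0)^j \right)
 + \frac{r}{g_0(x_0)} \big( \hat g_n(u) - g_0(x_0) - g_0'(x_0)(u - x_0) \big). \tag{4.2}
$$

A direct calculation reveals that with $r = -1/s > 0$,

$$
\mathbb{H}_n^{\mathrm{locmod}}(t) = \frac{-r \cdot r_n}{g_0(x_0)} \int_{l_{n,x_0}} \int_{x_0}^{v} \big( \hat g_n(u) - g_0(x_0) - (u - x_0) g_0'(x_0) \big)\,du\,dv + \frac{\hat A_n t + \hat B_n}{f_0(x_0)},
$$

and hence

$$
\begin{aligned}
n^{k/(2k+1)} \big( \hat g_n(x_0 + s_n t) - g_0(x_0) - s_n t\, g_0'(x_0) \big) &= -\frac{g_0(x_0)}{r} \frac{d^2}{dt^2} \mathbb{H}_n^{\mathrm{locmod}}(t),\\
n^{(k-1)/(2k+1)} \big( \hat g_n'(x_0 + s_n t) - g_0'(x_0) \big) &= -\frac{g_0(x_0)}{r} \frac{d^3}{dt^3} \mathbb{H}_n^{\mathrm{locmod}}(t).
\end{aligned} \tag{4.3}
$$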


It is clear from (4.1) that the order relationship YJ/‘^™°^(-) > ]HlJ/‘^™°‘^(-) is 
still valid for the modified processes. Now by tightness arguments, the limit 
process H of including its derivatives, exists uniquely, giving us the 

possibility of taking the limit in (4.3) as re —>■ 00. Finally we relate HI to the 
canonical process Hk defined in Theorem 4.1 by looking at their respective 
‘envelope’ functions Y and T/, where Y denotes the limit process of 
and Yk{t) = JqW{s) ds — Careful calculation of the limit of YJ/'’ and 
^fc,n,2 reveals that 


Y, 


locmod 


(t) —^d 


Vfo{xo) JO 


Wis) ds 


rgo\xo) k+2 
5o(xo)(fc + 2)! 


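Now by the scaling property of Brownian motion, $W(at) =_d \sqrt{a}\, W(t)$, we
get the following theorem.

Theorem 4.1. Under assumptions (A1)-(A4), we have

$$ \begin{pmatrix} n^{k/(2k+1)} (\hat g_n(x_0) - g_0(x_0)) \\ n^{(k-1)/(2k+1)} (\hat g_n'(x_0) - g_0'(x_0)) \end{pmatrix}
\to_d
\begin{pmatrix} -\Big( \dfrac{g_0^{2k}(x_0)\, g_0^{(k)}(x_0)}{r^{2k} f_0(x_0)^{k} (k+2)!} \Big)^{1/(2k+1)} H_k^{(2)}(0) \\[2ex]
 -\Big( \dfrac{g_0^{2k-2}(x_0)\, [g_0^{(k)}(x_0)]^{3}}{r^{2k-2} f_0(x_0)^{k-1} [(k+2)!]^{3}} \Big)^{1/(2k+1)} H_k^{(3)}(0) \end{pmatrix} \tag{4.4} $$

and

$$ \begin{pmatrix} n^{k/(2k+1)} (\hat f_n(x_0) - f_0(x_0)) \\ n^{(k-1)/(2k+1)} (\hat f_n'(x_0) - f_0'(x_0)) \end{pmatrix}
\to_d
\begin{pmatrix} \Big( \dfrac{r f_0(x_0)^{k+1} g_0^{(k)}(x_0)}{g_0(x_0)\,(k+2)!} \Big)^{1/(2k+1)} H_k^{(2)}(0) \\[2ex]
 \Big( \dfrac{r^{3} f_0(x_0)^{k+2} (g_0^{(k)}(x_0))^{3}}{g_0(x_0)^{3} [(k+2)!]^{3}} \Big)^{1/(2k+1)} H_k^{(3)}(0) \end{pmatrix} \tag{4.5} $$

where $H_k$ is the unique lower envelope of the process $Y_k$ satisfying

1. $H_k(t) \le Y_k(t)$ for all $t \in \mathbb{R}$;

2. $H_k^{(2)}$ is concave;

3. $H_k(t) = Y_k(t)$ if the slope of $H_k^{(2)}$ decreases strictly at t.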
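
The constants in (4.4) and (4.5) can be cross-checked against the Brownian scaling argument sketched before the theorem. The following is a minimal sketch, not part of the original derivation: it assumes the limiting envelope has the form $a \int_0^t W(s)\,ds - b\,t^{k+2}$ with $a = f_0(x_0)^{-1/2}$ and $b = r g_0^{(k)}(x_0)/(g_0(x_0)(k+2)!)$, and the function names and numerical inputs are purely illustrative.

```python
from math import factorial, isclose

def scaled_constants(k, r, f0, g0, gk):
    """Constants obtained from Brownian scaling.

    The envelope a*int_0^t W - b*t^(k+2) equals gamma1 * Y_k(gamma2 * t)
    in distribution when gamma1*gamma2^(3/2) = a and gamma1*gamma2^(k+2) = b.
    """
    e = 1.0 / (2 * k + 1)
    a = f0 ** -0.5
    b = r * gk / (g0 * factorial(k + 2))
    gamma2 = (b / a) ** (2 * e)
    gamma1 = a * gamma2 ** -1.5
    h2 = gamma1 * gamma2 ** 2   # scale of the second derivative at 0
    h3 = gamma1 * gamma2 ** 3   # scale of the third derivative at 0
    # (4.3) contributes the factor g0/r; f = g^(-r) contributes r*f0/g0.
    return {"g": g0 / r * h2, "dg": g0 / r * h3, "f": f0 * h2, "df": f0 * h3}

def stated_constants(k, r, f0, g0, gk):
    """Absolute values of the constants as written in (4.4) and (4.5)."""
    e = 1.0 / (2 * k + 1)
    K = factorial(k + 2)
    return {
        "g": (g0 ** (2 * k) * gk / (r ** (2 * k) * f0 ** k * K)) ** e,
        "dg": (g0 ** (2 * k - 2) * gk ** 3
               / (r ** (2 * k - 2) * f0 ** (k - 1) * K ** 3)) ** e,
        "f": (r * f0 ** (k + 1) * gk / (g0 * K)) ** e,
        "df": (r ** 3 * f0 ** (k + 2) * gk ** 3 / (g0 ** 3 * K ** 3)) ** e,
    }
```

Calling both functions with the same inputs, for example `k=2, r=2.0, f0=0.4, g0=0.4 ** -0.5, gk=1.3`, should return matching dictionaries up to floating-point error.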


Remark 4.2. We note that the minus sign appearing in (4.4) is due
to the convexity of $\hat g_n, g_0$ and the concavity of the limit process $H_k^{(2)}$.
The dependence of the constant appearing in the limit is optimal in view of
Theorem 2.23 in Seregin and Wellner (2010).


Remark 4.3. Assume $-1 < s < 0$ and $k = 2$. Let $f_0 = \exp(\varphi_0)$ be a
log-concave density where $\varphi_0 : \mathbb{R} \to \mathbb{R}$ is the underlying concave function. Then
$f_0$ is also s-concave. Let $g_s := f_0^{-1/r} = \exp(-\varphi_0/r)$ be the underlying convex
function when $f_0$ is viewed as an s-concave density. Then direct calculation
yields that

$$ g_s^{(2)}(x_0) = \frac{1}{r^2}\, g_s(x_0) \big( \varphi_0'(x_0)^2 - r \varphi_0''(x_0) \big). $$

Hence the constant before $H_2^{(2)}(0)$ appearing in (4.5) becomes

$$ \Big( \frac{f_0(x_0)^3 \varphi_0'(x_0)^2}{4!\, r} + \frac{f_0(x_0)^3 |\varphi_0''(x_0)|}{4!} \Big)^{1/5}. $$

Note that the second term in the above display is exactly the constant
involved in the limiting distribution when $f_0(x_0)$ is estimated via the
log-concave MLE, see (2.2), page 1305 in Balabdaoui, Rufibach and Wellner
(2009). The first term is non-negative and hence illustrates the price we
need to pay by estimating a true log-concave density via the Renyi divergence
estimator over a larger class of s-concave densities. We also note that
the additional term vanishes as $r \to \infty$, or equivalently $s \nearrow 0$.
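
The calculation in Remark 4.3 can be checked numerically. The sketch below assumes, purely for illustration, that $f_0$ is the standard normal density (so $\varphi_0'(x) = -x$ and $\varphi_0''(x) = -1$); the function names are hypothetical. It compares the stated formula for $g_s^{(2)}$ with a central finite difference and evaluates the two terms of the constant for $k = 2$.

```python
from math import exp, log, pi

def phi(x):
    """Log-density of N(0, 1); an illustrative choice of phi_0."""
    return -0.5 * x * x - 0.5 * log(2 * pi)

def g_s(x, r):
    """Convex transform g_s = f_0^(-1/r) = exp(-phi_0 / r)."""
    return exp(-phi(x) / r)

def g_s_second_formula(x, r):
    """g_s'' from the closed form g_s * (phi'^2 - r * phi'') / r^2."""
    phi1, phi2 = -x, -1.0
    return g_s(x, r) * (phi1 ** 2 - r * phi2) / r ** 2

def g_s_second_numeric(x, r, h=1e-4):
    """Central finite-difference approximation of g_s''."""
    return (g_s(x + h, r) - 2.0 * g_s(x, r) + g_s(x - h, r)) / h ** 2

def constants(x, r):
    """Return (constant in (4.5) for k = 2, log-concave MLE constant)."""
    f0 = exp(phi(x))
    phi1, phi2 = -x, -1.0
    extra = f0 ** 3 * phi1 ** 2 / (24.0 * r)   # vanishes as r -> infinity
    mle = f0 ** 3 * abs(phi2) / 24.0
    return (extra + mle) ** 0.2, mle ** 0.2
```

For any fixed `x`, the first entry of `constants(x, r)` decreases to the second as `r` grows, which is the statement that the extra price vanishes as s increases to 0.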


4.2. Estimation of the mode. We consider the estimation of the mode
of an s-concave density $f(\cdot)$ defined by
$M(f) := \inf\{t \in \mathbb{R} : f(t) = \sup_{u \in \mathbb{R}} f(u)\}$.


Theorem 4.4. Assume (A1)-(A4) hold. Then

$$ n^{1/(2k+1)} (\hat m_n - m_0) \to_d \Big( \frac{g_0(m_0)^2\, (k+2)!^2}{r^2 f_0(m_0)\, g_0^{(k)}(m_0)^2} \Big)^{1/(2k+1)} M\big(H_k^{(2)}\big), \tag{4.6} $$

where $\hat m_n = M(\hat f_n)$, $m_0 = M(f_0)$.


By Theorem 2.26 in Seregin and Wellner (2010), the dependence of the
constant on local smoothness is optimal when k = 2. Here we show that this
dependence is also optimal for k > 2.

Consider a class of densities $\mathcal{P}$ dominated by the canonical Lebesgue
measure on $\mathbb{R}^d$. Let $T : \mathcal{P} \to \mathbb{R}$ be any functional. For an increasing convex loss
function $l(\cdot)$ on $\mathbb{R}_+$, we define the minimax risk as

$$ R_l(n; T, \mathcal{P}) := \inf_{t_n} \sup_{p \in \mathcal{P}} E_{p^{\times n}}\, l\big( |t_n(X_1, \ldots, X_n) - T(p)| \big), \tag{4.7} $$

where the infimum is taken over all possible estimators of $T(p)$ based on
$X_1, \ldots, X_n$. Our basic method of deriving minimax lower bounds is based on
the following work in Jongbloed (2000).


Theorem 4.5 (Theorem 1 Jongbloed (2000)). Let $\{p_n\}$ be a sequence
of densities in $\mathcal{P}$ such that $\limsup_{n\to\infty} n h^2(p_n, p) \le \tau^2$ for some density
$p \in \mathcal{P}$. Then

$$ \liminf_{n\to\infty} \frac{R_l(n; T, \{p, p_n\})}{l\big( \exp(-2\tau^2)/4 \cdot |T(p_n) - T(p)| \big)} \ge 1. \tag{4.8} $$


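For fixed $g \in \mathcal{G}$ and $f := g^{1/s} = g^{-r}$, let $m_0 := M(g)$ be the mode of g.
Consider a class of local perturbations of g: For every $\epsilon > 0$, define

$$ \tilde g_\epsilon(x) = \begin{cases}
g(m_0 - \epsilon c_\epsilon) + (x - m_0 + \epsilon c_\epsilon)\, g'(m_0 - \epsilon c_\epsilon), & x \in [m_0 - \epsilon c_\epsilon, m_0 - \epsilon), \\
g(m_0 + \epsilon) + (x - m_0 - \epsilon)\, g'(m_0 + \epsilon), & x \in [m_0 - \epsilon, m_0 + \epsilon), \\
g(x), & \text{otherwise.}
\end{cases} $$

Here $c_\epsilon$ is chosen so that $\tilde g_\epsilon$ is continuous at $m_0 - \epsilon$. This construction of a
perturbation class is also seen in Balabdaoui, Rufibach and Wellner (2009);
Groeneboom, Jongbloed and Wellner (2001). By Taylor expansion at $m_0 - \epsilon$
we can easily see $c_\epsilon = 3 + o(1)$ as $\epsilon \to 0$. Since $\tilde f_\epsilon := \tilde g_\epsilon^{-r}$ is not a density, we
normalize it by $f_\epsilon(x) := \tilde f_\epsilon(x) / \int_{\mathbb{R}} \tilde f_\epsilon(y)\,dy$. Now $f_\epsilon$ is s-concave for each $\epsilon > 0$ with
mode $m_0 - \epsilon$.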
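
The continuity condition defining the constant c in the perturbation can be solved numerically. The sketch below is an illustration only: it treats the case k = 2 with two hypothetical convex functions with minimizer at 0, and uses plain bisection on the gap between the two tangent pieces at the point m0 - eps. The helper names are not from the paper.

```python
from math import cosh, sinh

def continuity_gap(c, eps, g, dg, m0=0.0):
    """Left tangent piece minus middle tangent piece, evaluated at m0 - eps."""
    anchor = m0 - eps * c
    x = m0 - eps
    left_piece = g(anchor) + (x - anchor) * dg(anchor)
    middle_piece = g(m0 + eps) + (x - m0 - eps) * dg(m0 + eps)
    return left_piece - middle_piece

def solve_c(eps, g, dg, lo=1.0, hi=10.0, tol=1e-12):
    """Bisection for the root of the continuity gap in [lo, hi]."""
    f_lo = continuity_gap(lo, eps, g, dg)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = continuity_gap(mid, eps, g, dg)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Illustrative convex functions with second derivative non-zero at 0 (k = 2).
def quad(x):
    return 1.0 + x * x

def dquad(x):
    return 2.0 * x

c_quadratic = solve_c(0.1, quad, dquad)   # exact quadratic case
c_cosh = solve_c(1e-3, cosh, sinh)        # non-quadratic case, small eps
```

For the exact quadratic the gap is proportional to 3 + 2c - c^2, so the root is 3 for every eps; for the cosh example the root is only close to 3 when eps is small.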

The following result follows from direct calculation. For a proof, we refer 
to Appendix section 6.3 . 


Lemma 4.6. Assume (A1)-(A4). Then

$$ h^2(f_\epsilon, f) = C_k\, \frac{r^2 f(m_0)\, (g^{(k)}(m_0))^2}{g(m_0)^2}\, \epsilon^{2k+1} + o(\epsilon^{2k+1}), $$

where

Ck = 


- 4 • 3^+2(2A: + 1)(3^+2 ^ ^2 ^ ^ _ 3) 


108(A:!)2(A; + 1)(A: + 2)(2A: + 1) 

+ {k + l){k + 2) ^27(3^*^+^ - 1) + 2 • 3‘^^{2k + l){2k{2k - 9) + 27) 

2k^{2k^ + 1) 

^ 3(A:!)2(A: + 1)(2A: + 1)' 

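Theorem 4.7. For an s-concave density $f_0$, let $SC_{n,\tau}(f_0)$ be defined by

$$ SC_{n,\tau}(f_0) := \Big\{ f : \text{s-concave density},\ h^2(f, f_0) \le \frac{\tau^2}{n} \Big\}. $$

Let $m_0 = M(f_0)$ be the mode of $f_0$. Suppose (A1)-(A4) hold. Then,

$$ \sup_{\tau > 0} \liminf_{n \to \infty} n^{1/(2k+1)} \inf_{T_n} \sup_{f \in SC_{n,\tau}} E_f |T_n - M(f)| \ge \rho_k \Big( \frac{g_0(m_0)^2}{r^2 f_0(m_0)\, g_0^{(k)}(m_0)^2} \Big)^{1/(2k+1)}, $$

where $\rho_k = (2(2k+1) C_k)^{-1/(2k+1)}/4$.


Proof. Take $l(x) = |x|$. Let $\epsilon = c n^{-1/(2k+1)}$, and let
$\gamma = r^2 f_0(m_0)\, g_0^{(k)}(m_0)^2 / g_0(m_0)^2$,
$f_n := f_{c n^{-1/(2k+1)}}$. Then $\limsup_{n\to\infty} n h^2(f_n, f) = C_k \gamma c^{2k+1}$ by Lemma 4.6. Applying
Theorem 4.5, we find that

$$ \liminf_{n\to\infty} n^{1/(2k+1)} R_l(n; T, \{f, f_n\}) \ge \frac{1}{4}\, c \exp\big( -2 C_k \gamma c^{2k+1} \big). $$

Now we choose $c = (2(2k+1) C_k \gamma)^{-1/(2k+1)}$ to conclude. □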
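
The choice of c in the proof above is the stationary point of the map c -> c exp(-2 A c^(2k+1)), where A stands for the product of the Lemma 4.6 constant and gamma. A small numerical sketch of this optimization follows; the value of A and the helper names are illustrative assumptions, not quantities taken from the paper.

```python
from math import exp

def lower_bound(c, A, k):
    """(1/4) * c * exp(-2 * A * c^(2k+1)), the bound delivered by Theorem 4.5."""
    return 0.25 * c * exp(-2.0 * A * c ** (2 * k + 1))

def best_c(A, k):
    """Stationary point of c -> c * exp(-2 * A * c^(2k+1))."""
    return (2.0 * (2 * k + 1) * A) ** (-1.0 / (2 * k + 1))

def grid_argmax(A, k, step=1e-3, upper=3.0):
    """Brute-force maximizer on a grid, used only as a sanity check."""
    grid = [i * step for i in range(1, int(upper / step))]
    return max(grid, key=lambda c: lower_bound(c, A, k))
```

At the stationary point the exponent equals -1/(2k+1), so `lower_bound(best_c(A, k), A, k)` coincides with `0.25 * best_c(A, k) * exp(-1 / (2 * k + 1))`.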
5. Discussion. We show in this paper that the class of s-concave
densities can be approximated and estimated via Renyi divergences in a robust
and stable way. We also develop local asymptotic distribution theory for
the divergence estimator, which suggests that the convexity constraint is
the main complexity within the class of s-concave densities, regardless of
heavy tails. In the rest of this section, we sketch some related problems and
future research directions.

5.1. Behavior of Renyi projection for generic measures Q when $s < -1/(d+1)$.
We have considered in this paper two regions for the index s: (1)
$-1/(d+1) < s < 0$ and (2) $-1/d < s \le -1/(d+1)$. In case (1), we
showed that starting from a generic measure Q with the interior of its
convex support non-void and a first moment, the Renyi projection through (1.3)
exists and enjoys nice continuity properties that cover both on- and
off-the-model situations. In case (2), we showed that the Renyi projection for the
empirical measure still enjoys such continuity properties when Q is a
probability measure corresponding to a true s-concave density with a finite first
moment.

It remains open to investigate the behavior of the Renyi projection in the
region (2) for a generic measure Q. If Q does not admit a first moment,
i.e. $\int \|x\|\,dQ(x) = \infty$, then the first term in the functional (1.3) diverges
for any candidate convex function. We conjecture that the Renyi divergence
projection fails to exist in this case. We do not know if the Renyi projection
exists when $-1/d < s \le -1/(d+1)$ and $Q \notin \mathcal{P}_s$ but $\int \|x\|\,dQ(x) < \infty$.

It should be mentioned that the MLEs for the classes $\mathcal{P}_s$ exist (for an
increasingly large sample size n as $s \searrow -1/d$), and are Hellinger consistent for
$-1/d < s < 0$ (cf. Seregin and Wellner (2010)). Moreover, it is known from
Doss and Wellner (2016) that the MLE does not exist for $s < -1/d$. But
we do not yet know any continuity properties of the maximum likelihood
projection “off the model”. This leaves the interval $-1/d < s \le -1/(d+1)$
presently without a nicely stable nonparametric estimation procedure. See
Koenker and Mizera (2010) pages 3008 and 3016 for some further discussion.

5.2. Global rates of convergence for Renyi divergence estimators.
Classical empirical process theory relates the maximum likelihood estimators
with Hellinger loss via ‘basic inequalities’ as coined in van de Geer (2000)
and van der Vaart and Wellner (1996). This reduces the problem of global
rates of convergence to the study of the modulus of continuity of the empirical
process indexed by a suitable transformation of the function class of interest.
We expect that similar ‘basic inequalities’ can be exploited to relate the
Renyi divergence estimators to some divergence (not necessarily Hellinger
distance). We also expect some uniformity in the rates of convergence for
the Renyi divergence estimators as observed by Kim and Samworth (2015)
in the case of the MLEs for log-concave densities.

5.3. Conjectures about the global rates in higher dimensions. It is now
well-understood from the work of Doss and Wellner (2016) that the MLEs
for s-concave densities ($-1 < s < 0$) and log-concave densities in dimension
1 converge at rates no worse than $O_p(n^{-2/5})$ in Hellinger loss. In higher
dimensions, Kim and Samworth (2015) provide an important lower bound
on the bracketing entropy for a subclass of log-concave densities on the order
of $\max\{\epsilon^{-d/2}, \epsilon^{-(d-1)}\}$ in Hellinger distance, and a matching upper bound up to
logarithmic factors for $d \le 3$. Lack of corresponding results in discrete convex
geometry precludes further upper bounds beyond d = 3. If a matching upper
bound can be achieved for $d \ge 4$ (with possible logarithmic losses), the rates
of convergence in squared Hellinger distances become

$$ r_n^2 = O(n^{-1/(d-1)}), \quad d \ge 4 $$

(up to logarithmic factors). It is also worth mentioning that minimum
contrast estimators may well be rate inefficient in higher dimensions, as observed
by Birge and Massart (1993) in another context with ‘trans-Donsker’ classes
of functions. Therefore it is also interesting to design sieved/regularized
estimators to achieve the efficient rates.

5.4. Adaptive estimation of concave-transformed class of functions. The
rates conjectured above are conservative in that they are derived from the
global point of view. From a local perspective, adaptive estimation may be
possible when the underlying function/density exhibits special structures.
In fact, it is shown by Guntuboyina and Sen (2015) that in the univariate
convex regression setting, if the underlying convex function is piecewise
linear, then the rate of convergence for the global risk in the discrete $\ell_2$
norm adapts to the nearly parametric rate $n^{-1/2}$ (up to logarithmic factors). It
would be interesting to examine if the same phenomenon can be observed for
the MLEs/Renyi divergence estimators, and more generally for minimum
contrast estimators of concave-transformed classes of functions.

6. Proofs. 

6.1. Proofs for Section 2. 

Proof of Lemma 2.1. Let $Q \in \mathcal{Q}_1$. Then by letting $g(x) := \|x\| + 1$,
we have

$$ L(Q) \le L(g, Q) = \int (1 + \|x\|)\,dQ + \frac{1}{|\beta|} \int \frac{dx}{(1 + \|x\|)^{-\beta}} < \infty $$

by noting $Q \in \mathcal{Q}_1$, and $-\beta = -1 - 1/s > d$. Now assume $L(Q) < \infty$. If
$Q \notin \mathcal{Q}_1$, i.e. $\int \|x\|\,dQ = \infty$, then since for each $g \in \mathcal{G}$, we can find some
$a, b > 0$ such that $g(x) \ge a\|x\| - b$, we have

$$ L(g, Q) = \int g\,dQ + \frac{1}{|\beta|} \int g^{\beta}\,dx \ge \int (a\|x\| - b)\,dQ = \infty, $$

a contradiction. This implies $Q \in \mathcal{Q}_1$. □


Proof of Theorem 2.2. We note that $L(Q) < \infty$ by Lemma 2.1. Hence
we can take a sequence $\{g_n\}_{n\in\mathbb{N}} \subset \mathcal{G}$ such that $\infty > M_0 \ge L(g_n, Q) \searrow L(Q)$
as $n \to \infty$ for some $M_0 > 0$. Now we claim that, for all $x_0 \in \mathrm{int}(\mathrm{csupp}(Q))$,

$$ \sup_{n\in\mathbb{N}} g_n(x_0) < \infty. \tag{6.1} $$

Denote $\epsilon_n := \inf_{x\in\mathbb{R}^d} g_n(x)$. First note,

$$ L(g_n, Q) \ge \int g_n\,dQ = \int g_n 1(g_n \le g_n(x_0))\,dQ + \int g_n 1(g_n > g_n(x_0))\,dQ $$
$$ = \int (g_n - g_n(x_0) + g_n(x_0)) 1(g_n \le g_n(x_0))\,dQ + \int g_n 1(g_n > g_n(x_0))\,dQ $$
$$ \ge g_n(x_0) - (g_n(x_0) - \epsilon_n)\, Q(\{g_n(\cdot) \le g_n(x_0)\}). $$

If $g_n(x_0) > \epsilon_n$, then $x_0$ is not an interior point of the closed convex set
$\{g_n \le g_n(x_0)\}$, which implies $Q(\{g_n(\cdot) \le g_n(x_0)\}) \le h(Q, x_0)$, where $h(\cdot,\cdot)$ is
defined in Lemma 7.9. Hence, in this case, the above term is lower bounded
by

$$ L(g_n, Q) \ge g_n(x_0) - (g_n(x_0) - \epsilon_n) h(Q, x_0) \ge g_n(x_0)(1 - h(Q, x_0)). $$

This inequality also holds for $g_n(x_0) = \epsilon_n$, which implies that

$$ g_n(x_0) \le \frac{L(g_n, Q)}{1 - h(Q, x_0)} \le \frac{M_0}{1 - h(Q, x_0)}, $$

by the first statement of Lemma 7.9. Thus we verified (6.1). Now we invoke
Lemma 7.14 and check conditions (A1)-(A2) as follows: (A1) follows
by (6.1); (A2) follows by the choice of $g_n$ since $\sup_{n\in\mathbb{N}} L(g_n, Q) \le M_0$. By
Lemma 7.13 we can find a subsequence $\{g_{n(k)}\}_{k\in\mathbb{N}}$ of $\{g_n\}_{n\in\mathbb{N}}$, and a function
$g \in \mathcal{G}$ such that $\{x \in \mathbb{R}^d : \sup_{k} g_{n(k)}(x) < \infty\} \subset \mathrm{dom}(g)$, and

$$ \lim_{k\to\infty,\, x\to y} g_{n(k)}(x) = g(y), \quad \text{for all } y \in \mathrm{int}(\mathrm{dom}(g)), $$
$$ \liminf_{k\to\infty,\, x\to y} g_{n(k)}(x) \ge g(y), \quad \text{for all } y \in \mathbb{R}^d. $$

Again for simplicity we assume that $\{g_n\}$ satisfies the above properties. We
note that

$$ L(Q) = \lim_{n\to\infty} \Big( \int g_n\,dQ + \frac{1}{|\beta|} \int g_n^{\beta}\,dx \Big) $$
$$ \ge \liminf_{n\to\infty} \int g_n\,dQ + \frac{1}{|\beta|} \liminf_{n\to\infty} \int g_n^{\beta}\,dx $$
$$ \ge \int g\,dQ + \frac{1}{|\beta|} \int g^{\beta}\,dx = L(g, Q) \ge L(Q), $$

where the third line follows from Fatou’s lemma for the first term, and
Fatou’s lemma and the fact that the boundary of a convex set has Lebesgue
measure zero for the second term (Theorem 1.1, Lang (1986)). This
establishes $L(g, Q) = L(Q)$, and hence g is the desired minimizer. Since
$g \in \mathcal{G}$ achieves its minimum, we may assume $x_0 \in \mathrm{Arg\,min}_{x\in\mathbb{R}^d}\, g(x)$. If
$g(x_0) = 0$, since g has domain with non-empty interior, we can choose
$x_1, \ldots, x_d \in \mathrm{dom}(g)$ such that $\{x_0, \ldots, x_d\}$ are in general position. Then
by Lemma 7.15 we find $L(g, Q) = \infty$, a contradiction. This implies g must
be bounded away from zero.

For the last statement, since g is a minimizer of (1.3) and g is bounded
away from zero, $L(g + c, Q)$ is well-defined for all $|c| \le \delta$
with small $\delta > 0$, and we must necessarily have $\frac{d}{dc} L(g + c, Q)|_{c=0} = 0$. On the
other hand it is easy to calculate that $\frac{d}{dc} L(g + c, Q) = 1 - \int (g(x) + c)^{\beta - 1}\,dx$.
This yields the desired result by noting $\beta - 1 = 1/s$. □
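
The first-order condition used in the last step of this proof can be illustrated in closed form. The sketch below is an assumption-laden toy example, not part of the proof: it takes d = 1, an index s in (-1/2, 0), the convex function g(x) = 1 + |x|, and treats the first absolute moment of Q as a given number `m1`. All function names are hypothetical.

```python
def beta_of(s):
    """beta = 1 + 1/s, so that beta - 1 = 1/s."""
    return 1.0 + 1.0 / s

def mass(c, s):
    """int (1 + c + |x|)^(1/s) dx over the real line (needs -1/2 < s < 0, c > -1)."""
    b = beta_of(s)
    return 2.0 * (1.0 + c) ** b / (-b)

def L_shift(c, s, m1):
    """L(g + c, Q) for g(x) = 1 + |x|, with m1 the first absolute moment of Q."""
    b = beta_of(s)
    # int (1 + c + |x|)^beta dx = 2 (1 + c)^(beta + 1) / (-(beta + 1))
    penalty = 2.0 * (1.0 + c) ** (b + 1.0) / (-(b + 1.0))
    return (1.0 + c + m1) + penalty / abs(b)

def dL(c, s):
    """d/dc L(g + c, Q) = 1 - int (g + c)^(beta - 1) dx."""
    return 1.0 - mass(c, s)

def stationary_shift(s):
    """Shift c at which (g + c)^(1/s) integrates to one."""
    b = beta_of(s)
    return (-b / 2.0) ** (1.0 / b) - 1.0
```

With `s = -0.25`, `dL(0.0, s)` equals 1/3, and at `c = stationary_shift(s)` the derivative vanishes, which is exactly the statement that the shifted function defines a probability density.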

Proof of Lemma 2.3. Let g, h be two minimizers of $L(\cdot, Q)$. Since
$\psi_s(x) = \frac{1}{|\beta|} x^{\beta}$ is strictly convex on $[0, \infty)$, $L(t \cdot g + (1 - t) \cdot h, Q)$ is strictly convex in
$t \in [0, 1]$ unless $g = h$ a.e. with respect to the canonical Lebesgue measure.
We claim that if two closed functions g, h agree a.e. with respect to the canonical
Lebesgue measure, then they must agree everywhere, thus closing the
argument. It is easy to see $\mathrm{int}(\mathrm{dom}\, g) = \mathrm{int}(\mathrm{dom}\, h)$. Since $\mathrm{int}(\mathrm{dom}(g)) \ne \emptyset$, we
have $\mathrm{ri}(\mathrm{dom}\, g) = \mathrm{int}(\mathrm{dom}\, g) = \mathrm{int}(\mathrm{dom}\, h) = \mathrm{ri}(\mathrm{dom}\, h)$. Also note that a
convex function is continuous in the interior of its domain, and hence almost
everywhere equality implies everywhere equality within the interior of the
domain, i.e. $g|_{\mathrm{int}(\mathrm{dom}\, g)} = h|_{\mathrm{int}(\mathrm{dom}\, h)}$. Now by Corollary 7.3.4 in Rockafellar
(1997), and the closedness of g, h, we find that $g = \mathrm{cl}\, g = \mathrm{cl}\, h = h$. □

Proof of Theorem 2.5. To show (2.1), we use Skorohod’s theorem;
since $Q_n \to_d Q$, there exist random vectors $X_n \sim Q_n$ and $X \sim Q$
defined on a common probability space $(\Omega, \mathcal{B}, P)$ satisfying $X_n \to_{a.s.} X$. Then
by Fatou’s lemma, we have $\int \|x\|\,dQ = E[\|X\|] \le \liminf_{n\to\infty} E[\|X_n\|] = \liminf_{n\to\infty} \int \|x\|\,dQ_n$.

Assume (2.2). We first claim that

$$ \limsup_{n\to\infty} L(Q_n) \le L(g, Q) = L(Q). \tag{6.2} $$

Let $g_n(\cdot), g(\cdot)$ be defined as in the statement of the theorem. Note that
$\limsup_{n\to\infty} L(g_n, Q_n) \le \lim_{n\to\infty} L(g^{(\epsilon)}, Q_n) = L(g^{(\epsilon)}, Q)$. Here $g^{(\epsilon)}$ is the
Lipschitz approximation of g defined in Lemma 7.8, and the last equality
follows from the moment convergence condition (2.2) by rewriting
$g^{(\epsilon)}(x) = \frac{g^{(\epsilon)}(x)}{1 + \|x\|} (1 + \|x\|)$; note the Lipschitz condition on $g^{(\epsilon)}$ implies boundedness
of $\frac{g^{(\epsilon)}(x)}{1 + \|x\|}$. By construction of $g^{(\epsilon)}$ we know that if $x_0$ is a minimizer of
g, then it is also a minimizer of $g^{(\epsilon)}$. This implies that the function class
$\{g^{(\epsilon)}\}_{\epsilon > 0}$ is bounded away from zero since g is bounded away from zero by
Theorem 2.2, i.e. $\inf_{x\in\mathbb{R}^d} g^{(\epsilon)}(x) \ge \epsilon_0$ holds for all $\epsilon > 0$ with some $\epsilon_0 > 0$.
Now let $\epsilon \searrow 0$; in view of Lemma 7.8, by the monotone convergence theorem
applied to $g^{(\epsilon)}$ and $\epsilon_0^{\beta} - (g^{(\epsilon)})^{\beta}$ we have verified (6.2).

Next, we claim that, for all $x_0 \in \mathrm{int}(\mathrm{dom}(g))$,

$$ \limsup_{n\to\infty} g_n(x_0) < \infty. \tag{6.3} $$

Denote $\epsilon_n := \inf_{x\in\mathbb{R}^d} g_n(x)$. Note by essentially the same argument as in the
proof of Theorem 2.2, we have

$$ g_n(x_0) \le \frac{L(Q_n)}{1 - h(Q_n, x_0)}. $$

By taking lim sup as $n \to \infty$, (6.3) follows by virtue of Lemma 7.9 and (6.2).

Now we proceed to show (2.3) and (2.4). By invoking Lemma 7.14, we
can easily check that all conditions are satisfied (note we also used (6.2)
here). Thus we can find a subsequence $\{g_{n(k)}\}_{k\in\mathbb{N}}$ of $\{g_n\}_{n\in\mathbb{N}}$ with
$g_{n(k)}(x) \ge a\|x\| - b$ for all $x \in \mathbb{R}^d$ and all $k \in \mathbb{N}$ with some $a, b > 0$. Hence by
Lemma 7.13, we can find a function $\tilde g \in \mathcal{G}$ such that
$\{x \in \mathbb{R}^d : \limsup_{k\to\infty} g_{n(k)}(x) < \infty\} \subset \mathrm{dom}(\tilde g)$, and that

$$ \lim_{k\to\infty,\, x\to y} g_{n(k)}(x) = \tilde g(y), \quad \text{for all } y \in \mathrm{int}(\mathrm{dom}(\tilde g)), $$
$$ \liminf_{k\to\infty,\, x\to y} g_{n(k)}(x) \ge \tilde g(y), \quad \text{for all } y \in \mathbb{R}^d. $$

Again for simplicity we assume $\{g_n\}$ admit the above properties. Now define
random variables $H_n := g_n(X_n) - (a\|X_n\| - b)$. Then by the same reasoning
as in the proof of Theorem 2.2, we have

$$ \liminf_{n\to\infty} L(Q_n) = \liminf_{n\to\infty} \Big( \int g_n\,dQ_n + \frac{1}{|\beta|} \int g_n^{\beta}\,dx \Big) $$
$$ \ge \liminf_{n\to\infty} E[H_n + a\|X_n\| - b] + \frac{1}{|\beta|} \int \tilde g^{\beta}\,dx $$
$$ \ge E[\liminf_{n\to\infty} H_n] + a \liminf_{n\to\infty} \int \|x\|\,dQ_n - b + \frac{1}{|\beta|} \int \tilde g^{\beta}\,dx $$
$$ \ge L(\tilde g, Q) + a \Big( \liminf_{n\to\infty} \int \|x\|\,dQ_n - \int \|x\|\,dQ \Big) $$
$$ \ge L(Q) + a \Big( \liminf_{n\to\infty} \int \|x\|\,dQ_n - \int \|x\|\,dQ \Big). $$

Note the expectation is taken with respect to the probability space $(\Omega, \mathcal{B}, P)$
defined above. This establishes that if (2.2) holds true, then

$$ \liminf_{n\to\infty} L(Q_n) \ge L(\tilde g, Q) \ge L(Q). \tag{6.4} $$

Conversely, if (2.2) does not hold true, then there exists a subsequence
$\{Q_{n(k)}\}$ such that $\liminf_{k\to\infty} \int \|x\|\,dQ_{n(k)} > \int \|x\|\,dQ$. However, this means
that $\liminf_{k\to\infty} L(Q_{n(k)}) > L(Q)$, which contradicts (2.3). Hence if (2.3)
holds, then (2.2) holds true. Combining (6.4) and (6.2), and by virtue of
Lemma 2.3, we find $\tilde g \equiv g$. This completes the proof for (2.3) and (2.4).

We show (2.5). First we claim that $\{x_n \in \mathrm{Arg\,min}_{x\in\mathbb{R}^d}\, g_n(x)\}_{n\in\mathbb{N}}$ is
bounded. If not, then we can find a subsequence such that $\|x_{n(k)}\| \to \infty$ as
$k \to \infty$. However this means that $g_{n(k)}(x) \ge g_{n(k)}(x_{n(k)}) \ge a\|x_{n(k)}\| - b \to \infty$
as $k \to \infty$ for any x, a contradiction. Next we claim that there exists $\epsilon_0 > 0$
such that $\inf_{k\in\mathbb{N}} \epsilon_{n(k)} \ge \epsilon_0$ holds for some subsequence $\{\epsilon_{n(k)}\}_{k\in\mathbb{N}}$ of $\{\epsilon_n\}_{n\in\mathbb{N}}$.
This can be seen as follows: Boundedness of $\{x_n\}$ implies $x_{n(k)} \to x^*$ as
$k \to \infty$ for some subsequence $\{x_{n(k)}\}_{k\in\mathbb{N}} \subset \{x_n\}_{n\in\mathbb{N}}$ and some $x^* \in \mathbb{R}^d$.
Hence by (2.4) we have $\limsup_{k\to\infty} f_{n(k)}(x_{n(k)}) \le f(x^*) < \infty$, since $f(\cdot)$ is
bounded. This implies that $\sup_{k\in\mathbb{N}} \|f_{n(k)}\|_\infty < \infty$, which is equivalent to the
claim. As before, we will understand the notation for the whole sequence as a
suitable subsequence. Now we have $g_n(x) \ge (a\|x\| - b) \vee \epsilon_0$ for all
$x \in \mathbb{R}^d$. This gives rise to

$$ f_n(x) \le \big( (a\|x\| - b) \vee \epsilon_0 \big)^{1/s}, \quad \text{for all } x \in \mathbb{R}^d. \tag{6.5} $$

Note that $-1/(d+1) < s < 0$ implies $1/s < -(d+1)$, whence we get
an integrable envelope. Now a simple application of the dominated convergence
theorem yields the desired result (2.5), in view of the fact that the boundary
of a convex set has Lebesgue measure zero (cf. Theorem 1.1 in Lang (1986)).

Finally, (2.6) and (2.7) are direct results of Theorems 3.7 and 3.8 by noting
that (2.5) entails $f_n \to_d f$ (in the sense that the corresponding probability
measures converge weakly). □

Proof of Corollary 2.7. It is known by Varadarajan’s theorem (cf.
Dudley (2002) Theorem 11.4.1) that the empirical measure $\mathbb{Q}_n$ converges weakly to Q with
probability 1. Further by the strong law of large numbers (SLLN), we know that
$\int \|x\|\,d\mathbb{Q}_n \to_{a.s.} \int \|x\|\,dQ$. This verifies all conditions required in Theorem
2.5. □

Proof of Corollary 2.8. The conclusion follows from Corollary 2.7
if $-1/(d+1) < s < 0$, so suppose $-1/d < s \le -1/(d+1)$. Since $f \in \mathcal{P}_{s'}$,
we may write $f = g^{1/s'}$ where g is convex. If f is unbounded, then
$g(x_0) = 0$ for some $x_0 \in \mathbb{R}^d$. By Lemma 7.15 with $r' = -1/s'$, it follows
that $\int f = \infty$, contradicting the fact that f is a density. Thus f must
necessarily be bounded. To see that f has a finite mean, note that by Lemma
3.5 $f(x) \le (b + a\|x\|)^{1/s'}$ where $a, b > 0$ and $r' = -1/s' > d + 1$. Thus
$\int_{\mathbb{R}^d} \|x\| f(x)\,dx \le \int_{\mathbb{R}^d} \|x\| (b + a\|x\|)^{-r'}\,dx < \infty$. Now note that (2.8) holds by
the existence of the Renyi divergence estimator for the empirical measure
(cf. Theorem 4.1 in Koenker and Mizera (2010)) and the same argument
as in the proof of Theorem 2.5. Also note that by the proof of Theorem 3.7,
(2.8) would be enough to ensure (2.10). Since f is continuous on the interior
of the domain, we see that (2.10) implies weak convergence: let $Q_n$ be the
measures corresponding to $f_n$. Then $Q_n \to Q$ weakly as $n \to \infty$. Now the
rest follows immediately from Theorems 3.6 and 3.8. □

Proof of Theorem 2.9. Denote $L(\cdot) := L(\cdot, Q)$. We first claim:

Claim. $g = \mathrm{argmin}_{g\in\mathcal{G}} L(g)$ if and only if $\lim_{t\searrow 0} \frac{L(g + th) - L(g)}{t} \ge 0$ holds for
all $h : \mathbb{R}^d \to \mathbb{R}$ such that there exists $t_0 > 0$ with $g + th \in \mathcal{G}$ for all
$t \in (0, t_0)$.

To see this, we only have to show sufficiency. Now suppose g is not a
minimizer of $L(\cdot)$. By Theorem 2.2 we know there exists $\hat g \in \mathcal{G}$ such that
$\hat g = g(\cdot|Q)$. By convexity, we have that for any $t > 0$, $L(g + t(\hat g - g)) \le (1 - t) L(g) + t L(\hat g)$.
This implies that if we let $h = \hat g - g$ and $t_0 = 1$, then

$$ L(g + th) - L(g) \le (1 - t) L(g) + t L(\hat g) - L(g) = -t \big( L(g) - L(\hat g) \big), $$

and thus $\lim_{t\searrow 0} \frac{L(g + th) - L(g)}{t} \le -\big( L(g) - L(\hat g) \big) < 0$, where the strict
inequality follows from Lemma 2.3. This proves our claim. Now the theorem
follows from simple calculation:

$$ 0 \le \lim_{t\searrow 0} \frac{1}{t} \big( L(g + th) - L(g) \big) = \int h\,dQ - \int h \cdot g^{1/s}\,d\lambda, $$

as desired. □

Proof of Corollary 2.10. Let $g = g(\cdot|Q)$. Then by Theorem 2.2 and
Lemma 7.10, we find that there exist some $a, b > 0$ such that $g(x) \ge a\|x\| + b$.
Now take $v \in \partial h(0)$, i.e. $h(x) \ge h(0) + v^T x$ holds for all $x \in \mathbb{R}^d$.
Hence for $t > 0$, we have

$$ g(x) + t h(x) \ge a\|x\| + b + t(h(0) + v^T x) \ge (a - t\|v\|)\|x\| + (b + t h(0)), $$

which implies that $g + th \in \mathcal{G}$ for $t > 0$ small enough. Now the conclusion
follows from Theorem 2.9. □

Proof of Theorem 2.12. We first note that if F is a distribution
function for a probability measure supported on $[X_{(1)}, X_{(n)}]$, and $h : [X_{(1)}, X_{(n)}] \to \mathbb{R}$
is an absolutely continuous function, then integration by parts (Fubini’s
theorem) yields

$$ \int h\,dF = h(X_{(n)}) - \int_{X_{(1)}}^{X_{(n)}} h'(x) F(x)\,dx. \tag{6.6} $$
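
For the empirical distribution function, identity (6.6) reduces to a summation-by-parts formula that is easy to verify numerically. The following sketch is illustrative only; the sample values and the test functions are arbitrary choices, and the helper names are hypothetical.

```python
def integral_h_dFn(sample, h):
    """int h dF_n for the empirical distribution: the sample average of h."""
    return sum(h(x) for x in sample) / len(sample)

def parts_formula(sample, h):
    """h(X_(n)) - int_{X_(1)}^{X_(n)} h'(x) F_n(x) dx.

    F_n is constant on each interval between consecutive order statistics,
    so the integral of h' * F_n over such an interval is the value of F_n
    times the increment of h (h is assumed absolutely continuous).
    """
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i in range(n - 1):
        # F_n equals (i + 1) / n between xs[i] and xs[i + 1]
        total += (i + 1) / n * (h(xs[i + 1]) - h(xs[i]))
    return h(xs[-1]) - total

sample = [0.3, -1.2, 2.5, 0.9, 0.3]
left = integral_h_dFn(sample, lambda x: abs(x - 0.5))
right = parts_formula(sample, lambda x: abs(x - 0.5))
```

The two quantities `left` and `right` agree up to floating-point error, including when the sample contains ties.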

First we assume that $\hat g_n$ is the minimizer. For fixed $t \in [X_{(1)}, X_{(n)}]$, let $h_1$ be a convex
function whose derivative is given by $h_1'(x) = -1(x \le t)$. Now by Theorem
2.9 we find that $\int h_1 \hat f_n\,d\lambda = \int h_1\,d\hat F_n \le \int h_1\,d\mathbb{F}_n$. Plugging in (6.6) we find
that $\int_{X_{(1)}}^{t} \hat F_n(x)\,dx \le \int_{X_{(1)}}^{t} \mathbb{F}_n(x)\,dx$. For $t \in S_n(\hat g_n)$, let $h_2$ be the function
with derivative $h_2'(x) = 1(x \le t)$. It is easy to see that $\hat g_n + t h_2$ is convex for
$t > 0$ small enough, whence Theorem 2.9 is valid, thus giving the reverse
direction of the inequality. This shows the necessity.

For sufficiency, assume $\hat g_n$ satisfies (2.13). In view of the proof of Theorem
2.9, we only have to show that (2.12) holds for all functions $h : \mathbb{R} \to \mathbb{R}$ which are
linear on $[X_{(i)}, X_{(i+1)}]$ $(i = 1, \ldots, n - 1)$ and for which $\hat g_n + th$ is convex for $t > 0$ small
enough. Since $\hat g_n$ is a linear function between two consecutive knots, h must
be convex between consecutive knots. This implies that the derivative of
such an h can be written as $h'(x) = \sum_{j=2}^{n} \beta_j 1(x \le X_{(j)})$ with $\beta_2, \ldots, \beta_n$
satisfying $\beta_j \le 0$ if $X_{(j)} \notin S_n(\hat g_n)$. Now again by (6.6) we have

$$ \int h\,d(\mathbb{F}_n - \hat F_n) = -\sum_{j=2}^{n} \beta_j \int_{X_{(1)}}^{X_{(j)}} \big( \mathbb{F}_n(x) - \hat F_n(x) \big)\,dx \ge 0, $$

as desired. □


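Proof of Corollary 2.13. This follows directly from Theorem
2.12 by noting that for $x_1 < x_0 < x_2$ we have

$$ \frac{1}{x_2 - x_0} \int_{x_0}^{x_2} \hat F_n(x)\,dx \le \frac{1}{x_2 - x_0} \int_{x_0}^{x_2} \mathbb{F}_n(x)\,dx, $$

and

$$ \frac{1}{x_0 - x_1} \int_{x_1}^{x_0} \hat F_n(x)\,dx \ge \frac{1}{x_0 - x_1} \int_{x_1}^{x_0} \mathbb{F}_n(x)\,dx. $$

Now letting $x_1 \nearrow x_0$ and $x_2 \searrow x_0$, we find that $\hat F_n(x_0) \le \mathbb{F}_n(x_0)$ by right
continuity and $\hat F_n(x_0) \ge \mathbb{F}_n(x_0-) = \mathbb{F}_n(x_0) - \frac{1}{n}$. □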

Proof of Theorem 2.14. The proof closely follows the proof of
Theorem 2.7 of Dümbgen, Samworth and Schuhmacher (2011). For the reader’s
convenience we give a full proof here. Let P denote the probability
distribution corresponding to F. We first show necessity by assuming $g = g(\cdot|Q)$.
By Corollary 2.10 applied to $h(x) = \pm x$, we find by Fubini’s theorem that

$$ 0 = \int_{\mathbb{R}} x\,d(Q - P)(x) = \int_{\mathbb{R}} (F - G)(t)\,dt, $$

which proves (1). Now we turn to (2). Since the map $s \mapsto (s - x)_+$ is convex,
again by Corollary 2.10, we find

$$ 0 \le \int_{\mathbb{R}} (s - x)_+\,d(Q - P)(s) = -\int_{-\infty}^{x} (F - G)(t)\,dt, $$

where in the last equality we used the proved fact that $\int_{\mathbb{R}} (F - G)\,d\lambda = 0$.
Now we assume $x \in S(g)$, and discuss two different cases to conclude. If
$x \in \partial(\mathrm{dom}(g))$, then let $h(s) = -(s - x)_+$; it is easy to see $g + th \in \mathcal{G}$ for
$t > 0$ small enough. Then by Theorem 2.9, we have

$$ 0 \le \int h(s)\,d(Q - P)(s) = \int_{-\infty}^{x} (F - G)(t)\,dt. $$

If $x \in \mathrm{int}(\mathrm{dom}(g))$, then $g'(x - \delta) < g'(x + \delta)$ for small $\delta > 0$ by definition,
and hence we define

$$ H_\delta'(u) := -\frac{g'(u) - g'(x - \delta)}{g'(x + \delta) - g'(x - \delta)}\, 1_{\{u \in [x - \delta, x + \delta]\}} - 1_{\{u > x + \delta\}}, $$

whose integral $H_\delta(s) := \int_{-\infty}^{s} H_\delta'(u)\,du$ serves as an approximation of
$-(s - x)_+$ as $\delta \searrow 0$. Note that for $s \ge x - \delta$,

$$ (g + t H_\delta)(s) = g(s) - \frac{t}{g'(x + \delta) - g'(x - \delta)} \int_{x - \delta}^{s \wedge (x + \delta)} \big( g'(u) - g'(x - \delta) \big)\,du - t\big( s - (x + \delta) \big)_+, $$

implying $g + t H_\delta \in \mathcal{G}$ for $t > 0$ small enough (which may depend on $\delta$).
Then by Theorem 2.9,

$$ 0 \le \int H_\delta(s)\,d(Q - P)(s) \to -\int (s - x)_+\,d(Q - P)(s) = \int_{-\infty}^{x} (F - G)(t)\,dt, $$

as $\delta \searrow 0$, where the convergence follows easily from the dominated convergence
theorem. This proves (2). Now we show sufficiency by assuming (1)-(2).
Consider a Lipschitz continuous function $\Delta(\cdot)$ with Lipschitz constant L.
Then

$$ \int \Delta\,d(Q - P) = \int \Delta' (F - G)\,d\lambda = -\int (L - \Delta')(F - G)\,d\lambda $$
$$ = -\int_{\mathbb{R}} \Big( \int_{-L}^{L} 1_{\{s > \Delta'(t)\}}\,ds \Big) (F - G)(t)\,dt $$
$$ = -\int_{-L}^{L} \int_{A(\Delta', s)} (F - G)(t)\,dt\,ds, $$

where the second equality follows from (1), and $A(\Delta', s) := \{t \in \mathbb{R} : \Delta'(t) < s\}$.
Now replace the generic Lipschitz function $\Delta$ with $g^{(\epsilon)}$ as defined in
Lemma 7.8 with Lipschitz constant $L = 1/\epsilon$. Note in this case $A((g^{(\epsilon)})', s) = (-\infty, a(g, s))$,
where $a(g, s) = \min\{t \in \mathbb{R} : g'(t+) \ge s\}$ and hence $a(g, s) \in S(g)$.
This implies that $\int_{A((g^{(\epsilon)})', s)} (F - G)(t)\,dt = 0$ for all $s \in (-L, L)$ by
(2), yielding that $\int g^{(\epsilon)}\,d(Q - P) = 0$. Similarly we have $\int g_0^{(\epsilon)}\,d(Q - P) \ge 0$
where $g_0 = g(\cdot|Q)$. Now let $\epsilon \searrow 0$; by the monotone convergence theorem we
find that $\int g\,dQ = \int g\,dP$ and that $\int g_0\,dQ \ge \int g_0\,dP$. This yields

$$ L(g_0, Q) \ge L(g_0, P) \ge L(g, P) = L(g, Q), $$

where the second inequality follows from the Fisher consistency of the functional
$L(\cdot, \cdot)$ and the fact that P is the distribution corresponding to g. □

Before we prove Theorem 2.16, we will need an elementary lemma. 


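Lemma 6.1. Fix a sequence $0 < \alpha_n < 1$ with $\alpha_n \nearrow 1$. Let $f_{\alpha_n}$ be an
$(\alpha_n - 1)$-concave density on $\mathbb{R}$. Let $g_{\alpha_n} := f_{\alpha_n}^{\alpha_n - 1}$ be the underlying convex
function. Suppose the $g_{\alpha_n}$ are linear on $[a, b]$ with $\lim_{n\to\infty} f_{\alpha_n}(a) = \gamma_a \in [0, \infty]$
and $\lim_{n\to\infty} f_{\alpha_n}(b) = \gamma_b \in [0, \infty]$. Then for all $x \in [a, b]$,

$$ f_{\alpha_n}(x) \to \exp\Big( \frac{\log \gamma_b - \log \gamma_a}{b - a} (x - a) + \log \gamma_a \Big), \tag{6.7} $$

where $\exp(-\infty) := 0$ and $\exp(\infty) := \infty$.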


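Proof of Lemma 6.1. First assume $\gamma_b \ne \gamma_a$ and $\gamma_a, \gamma_b \in (0, \infty)$. For
notational convenience we drop explicit dependence on n and the limit is
taken as $\alpha \nearrow 1$. Let $\gamma_{a,\alpha} = f_\alpha(a) = g_\alpha(a)^{1/(\alpha - 1)}$ and
$\gamma_{b,\alpha} = f_\alpha(b) = g_\alpha(b)^{1/(\alpha - 1)}$. For any $x \in [a, b]$,

$$ \lim_{\alpha \nearrow 1} \log f_\alpha(x) = \lim_{\alpha \nearrow 1} \frac{1}{\alpha - 1} \log\Big( \frac{\gamma_{b,\alpha}^{\alpha - 1} - \gamma_{a,\alpha}^{\alpha - 1}}{b - a} (x - a) + \gamma_{a,\alpha}^{\alpha - 1} \Big) \tag{6.8} $$
$$ = \lim_{\alpha \nearrow 1} \frac{1}{\alpha - 1} \log\Big( \frac{\gamma_b^{\alpha - 1} - \gamma_a^{\alpha - 1}}{b - a} (x - a) \cdot \frac{\gamma_{b,\alpha}^{\alpha - 1} - \gamma_{a,\alpha}^{\alpha - 1}}{\gamma_b^{\alpha - 1} - \gamma_a^{\alpha - 1}} + \gamma_{a,\alpha}^{\alpha - 1} \Big) $$
$$ = \log \gamma_a + \lim_{\alpha \nearrow 1} \frac{1}{\alpha - 1} \log\Big( \frac{\gamma_b^{\alpha - 1} - \gamma_a^{\alpha - 1}}{b - a} (x - a) \cdot \frac{r_\alpha}{\gamma_{a,\alpha}^{\alpha - 1}} + 1 \Big). $$

Since $\gamma_{a,\alpha}^{\alpha - 1} \to 1$, we claim that it suffices to show that

$$ r_\alpha = \frac{\gamma_{b,\alpha}^{\alpha - 1} - \gamma_{a,\alpha}^{\alpha - 1}}{\gamma_b^{\alpha - 1} - \gamma_a^{\alpha - 1}} \to 1 \quad \text{as } \alpha \nearrow 1. \tag{6.9} $$

To see this, assume without loss of generality that $\gamma_a > \gamma_b$ and hence
$\gamma_b^{\alpha - 1} - \gamma_a^{\alpha - 1} > 0$. Suppose that (6.9) holds and let $\epsilon > 0$. Then the second term on the
right hand side of (6.8) can be bounded from above by

$$ \lim_{\alpha \nearrow 1} \frac{1}{\alpha - 1} \log\Big( \frac{\gamma_b^{\alpha - 1} - \gamma_a^{\alpha - 1}}{b - a} (x - a)(1 - \epsilon) + 1 \Big) $$
$$ = \lim_{\alpha \nearrow 1} \big( \log \gamma_b \cdot \gamma_b^{\alpha - 1} - \log \gamma_a \cdot \gamma_a^{\alpha - 1} \big) \frac{x - a}{b - a} (1 - \epsilon) $$
$$ = (\log \gamma_b - \log \gamma_a) \frac{x - a}{b - a} (1 - \epsilon), $$

where the second line follows from L’Hospital’s rule. Similarly we can derive
a lower bound:

$$ (\log \gamma_b - \log \gamma_a) \frac{x - a}{b - a} (1 + \epsilon). $$

Thus it remains to show that (6.9) holds. But we can rewrite $r_\alpha$ as

$$ r_\alpha = \frac{\gamma_{a,\alpha}^{\alpha - 1}}{\gamma_a^{\alpha - 1}} \cdot \frac{c_\alpha^{\alpha - 1} - 1}{c^{\alpha - 1} - 1}
 = \frac{\gamma_{a,\alpha}^{\alpha - 1}}{\gamma_a^{\alpha - 1}} \cdot \frac{c^{\alpha - 1} (c_\alpha/c)^{\alpha - 1} - (c_\alpha/c)^{\alpha - 1} + (c_\alpha/c)^{\alpha - 1} - 1}{c^{\alpha - 1} - 1} $$
$$ = \frac{\gamma_{a,\alpha}^{\alpha - 1}}{\gamma_a^{\alpha - 1}} \Big( (c_\alpha/c)^{\alpha - 1} + \frac{(c_\alpha/c)^{\alpha - 1} - 1}{c^{\alpha - 1} - 1} \Big) \to 1 + 0 \quad \text{as } \alpha \nearrow 1, $$

since $\log((c_\alpha/c)^{\alpha - 1}) = (\alpha - 1) \log(c_\alpha/c) \to 0 \cdot \log 1 = 0$, and where the second
limit follows from an upper and lower bound argument using $c_\alpha/c \to 1$.
Here $c_\alpha := \gamma_{b,\alpha}/\gamma_{a,\alpha}$ and $c = \gamma_b/\gamma_a \ne 1$.

This shows that (6.9) holds, thereby proving the case for $\gamma_a \ne \gamma_b \in (0, \infty)$.
For the case $\gamma_b = \gamma_a \in (0, \infty)$, similarly we have

$$ \lim_{\alpha \nearrow 1} \log f_\alpha(x) = \log \gamma_a + \lim_{\alpha \nearrow 1} \frac{1}{\alpha - 1} \log\Big( \frac{c_\alpha^{\alpha - 1} - 1}{b - a} (x - a) + 1 \Big). $$

The second term is 0 by an argument much as above by observing $c_\alpha = \gamma_{b,\alpha}/\gamma_{a,\alpha} \to \gamma_b/\gamma_a = 1$.
Finally, if $\gamma_a \wedge \gamma_b = 0$, then by the first line of (6.8)
we see that $\log f_\alpha(x) \to -\infty$; if $\gamma_a \vee \gamma_b = \infty$, then again $\log f_\alpha(x) \to \infty$. □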
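
The limit in Lemma 6.1 is the familiar convergence of a power mean to a geometric mean. The sketch below illustrates it numerically under a simplifying assumption that is not part of the lemma: the endpoint values are held fixed at gamma_a and gamma_b instead of converging along the sequence. Function names and numerical values are illustrative.

```python
from math import exp, log

def f_alpha(x, a, b, gamma_a, gamma_b, alpha):
    """Value at x of a density whose transform g = f^(alpha - 1) is linear on [a, b]."""
    p = alpha - 1.0
    ga, gb = gamma_a ** p, gamma_b ** p
    g = ga + (gb - ga) * (x - a) / (b - a)   # linear interpolation of g
    return g ** (1.0 / p)

def log_linear_limit(x, a, b, gamma_a, gamma_b):
    """Right-hand side of (6.7): log-linear interpolation of the endpoint values."""
    slope = (log(gamma_b) - log(gamma_a)) / (b - a)
    return exp(slope * (x - a) + log(gamma_a))

def errors(x, a, b, gamma_a, gamma_b, alphas):
    """Absolute gap between f_alpha(x) and the limit for each alpha in alphas."""
    target = log_linear_limit(x, a, b, gamma_a, gamma_b)
    return [abs(f_alpha(x, a, b, gamma_a, gamma_b, al) - target) for al in alphas]
```

For instance, `errors(0.3, 0.0, 1.0, 2.0, 0.5, [0.9, 0.99, 0.999999])` returns a decreasing list, consistent with convergence to the log-linear interpolant as alpha increases to 1.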


Proof of Theorem 2.16. In the following, the notation sup^, info,, limo, 
is understood as taking corresponding operation over a close to 1 unless 
otherwise specified. We first show almost everywhere convergence by invok¬ 
ing Lemma 7.13. To see this, for fixed sq G (—1/2,0), let ga '■= /q~^ and 
9a°'^ ■= {faY°- Then for a > H-sq, the transformed function is convex. 
We need to check two conditions in order to apply Lemma 7.13 as follows: 
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(Cl) The set (X(i),X(„)) C {limmf„/ q,(x) > 0}; 

(C2) There is a uniform lower bound function G Q such that > g^° 
holds for a sufficiently close to 1. 

The first assertion can be checked by using the characterization Theorem 
2.12. Let Fa be the distribution function of fa- Then — F„)(x) dx < 

0 with equality attained if and only if t € Sn{ga)- For x G close 

enough to we claim that liminfo/a(x) > 0. If not, we may assume 

without loss of generality that limQ faix) = 0. We first note that there exists 
some t G {l,---n — 1} and some subsequence {a(/3)}^gM with a(/3) ^ 1 
for which (1) is a knot point for {ga{0)}-, and (2) is not a knot 
point for any {ga{^)} for u > t + 1, i.e. ga(p)^ are linear on We 

drop /3 for notational simplicity and assume without loss of generality that 
both limits liiUalimaexist. Now Lemma 6.1 shows that 
minjlimo, limQ /ct(X(-j))} = 0 since we have assumed lima fa{x) = 0 

for some x G This in turn implies that limQ/Q,(x) = 0 for 

all X G Now we consider the following two cases to derive a 

contradiction with the fact 

f^(n) f^(n) 

(6.10) / Fa{x)dx = / F„(x)dx 

that follows from Theorem 2.12, thereby proving liminfQ/Q(x) > 0 for x 
close enough to . 

[Case 1.] If $\lim_\alpha f_\alpha(X_{(n)}) = 0$, then the left hand side of (6.10) converges
to $X_{(n)} - X_{(t)}$ while the right hand side is no larger than
$\frac{n-1}{n}(X_{(n)} - X_{(t)})$.

[Case 2.] If $\lim_\alpha f_\alpha(X_{(n)}) > 0$, then we must necessarily have $\lim_\alpha f_\alpha(x) = 0$
for all $x \in [X_{(1)}, X_{(n)})$ by convexity of the $g_\alpha$: if $\lim_\alpha f_\alpha(x_0) > 0$ for some
$x_0 \in [X_{(1)}, X_{(t)}]$, then $\lim_\alpha g_\alpha(x_0) \vee g_\alpha(X_{(n)}) < \infty$ while $\lim_\alpha g_\alpha(x) = \infty$ for all
$x \in (X_{(t)}, X_{(n)})$, which is absurd. Note that this also forces $\lim_\alpha f_\alpha(X_{(n)}) = \infty$,
otherwise the constraint $\int f_\alpha = 1$ will be invalid eventually. Now the left
hand side of (6.10) converges to 0 while the right hand side is bounded from
below by $\frac{1}{n}(X_{(n)} - X_{(t)})$.

Similarly we can show $\liminf_\alpha f_\alpha(x) > 0$ for $x$ close to $X_{(1)}$. Now (C1)
follows by convexity of $g_\alpha$.

(C2) can be seen by first noting $M := \sup_\alpha \|f_\alpha\|_\infty < \infty$. This can be
verified by Lemma 3.3 combined with the first assertion proved above. This
implies that the class $\{g_\alpha^{(s_0)}\}$ has a uniform lower bound $M^{s_0}$. Now (C2)
follows by noting that the domain of all $g_\alpha$ is conv(X). Therefore all
conditions needed for Lemma 7.13 are valid, and hence we can extract a
subsequence $\{\alpha_n\}_{n \in \mathbb{N}}$ with $\alpha_n \nearrow 1$ such that

$\lim_{n \to \infty, x \to y} g_{\alpha_n}^{(s_0)}(x) = g^{(s_0)}(y)$, for all $y \in \mathrm{int}(\mathrm{dom}(g^{(s_0)}))$;

$\liminf_{n \to \infty, x \to y} g_{\alpha_n}^{(s_0)}(x) \ge g^{(s_0)}(y)$, for all $y \in \mathbb{R}^d$,

holds for some $g^{(s_0)} \in \mathcal{G}$. This implies $f_{\alpha_n} \to_{a.e.} f^{(s_0)}$ as $n \to \infty$, where
$f^{(s_0)} := (g^{(s_0)})^{1/s_0}$. Now repeat the above argument with another $s_1$ and a
further extracted subsequence $\{\alpha_{n(k)}\}_{k \in \mathbb{N}}$; we see that $f_{\alpha_{n(k)}} \to_{a.e.} f^{(s_1)}$ $(k \to \infty)$
for some $s_1$-concave $f^{(s_1)}$ along the subsequence $\{\alpha_{n(k)}\}_{k \in \mathbb{N}}$. This implies
that $f^{(s_0)} =_{a.e.} f^{(s_1)}$. Since a convex function is continuous in the interior of
the domain, we can choose an upper semi-continuous version $f$ such that
$f = f^{(s)}$ a.e. for all $s \in \{-1/2 < s < 0\} \cap \mathbb{Q}$. This implies that $f$ is $s$-concave for
any rational $-1/2 < s < 0$ and hence log-concave. Next we show weighted $L_1$
convergence: for fixed $\kappa > 0$, choose $0 > s_0 > -1/(\kappa + 1)$. Since there exist
$a, b > 0$ such that $g_{\alpha_n}^{(s_0)}(x) \ge a\|x\| - b$ holds for all $n \in \mathbb{N}$, we have an
integrable envelope function:

$(1 + \|x\|)^\kappa (f_{\alpha_n}(x) \vee f(x)) \le (1 + \|x\|)^\kappa \left((a\|x\| - b) \vee M^{s_0}\right)^{1/s_0}$.

Now an application of the dominated convergence theorem yields the desired
weighted $L_1$ convergence. Similar arguments show that weighted convergence is
also valid in arbitrary $L_p$ norms $(p \ge 1)$.

Finally we show that $f = f_1$ by virtue of Theorem 2.2 in Dümbgen and Rufibach
(2009) and Theorem 2.9. We note that by Lemma 6.1, $f$ must be log-linear
between consecutive data points. Now since $f_1$ and $f$ are both log-linear
between consecutive data points of $\{X_1, \ldots, X_n\}$, we only have to consider test
functions $h$ such that $h$ is piecewise linear on consecutive data points. Recall
that $g_\alpha = f_\alpha^{\alpha-1}$ and $g = -\log f$ are the underlying convex functions for $f_\alpha$ and
$f$. For any such $h$ with the property that $g + th \in \mathcal{G}$ for $t$ small enough, we
wish to argue that such $h$ is also a valid test function for $f_\alpha$ (i.e. $g_\alpha + th \in \mathcal{G}$ for
$t > 0$ small enough), for a sequence $\{\alpha_k\}$ converging up to 1 as $k \to \infty$. Thus
we only have to argue that for all $X_{(j)} \in S(g)$, $X_{(j)} \in S(g_{\alpha_k})$ for a sequence
$\{\alpha_k\}$ going up to 1 as $k \to \infty$. Assume the contrary that $X_{(j)} \notin S(g_\alpha)$ for all
$\alpha$ close enough to 1. Then the $g_\alpha$ are all linear on a closed interval $I = [a, b]$
containing $X_{(j)}$ for $\alpha$ close to 1. Since $f_\alpha \to f$ uniformly on $I$ by Theorem
3.7, in particular $f_\alpha(a)$ and $f_\alpha(b)$ converge, Lemma 6.1 entails that $f$ is
log-linear over $I$, a contradiction to the fact $X_{(j)} \in S(g)$. Hence we can find
a subsequence $\{\alpha_k\}$ going up to 1 as $k \to \infty$ such that for all $X_{(j)} \in S(g)$,
$X_{(j)} \in S(g_{\alpha_k})$, i.e. every feasible test function $h$ for $f$ that is linear between
consecutive data points is also valid for $f_{\alpha_k}$. Now combining the fact that $f_{\alpha_k}$
converges in $L_2$ metric to $f$ and Theorem 2.2 in Dümbgen and Rufibach
(2009) we conclude $f_1 = f$. □

6.2. Proofs for Section 3.


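Proof of Lemma 3.1. The proof closely follows the first part of the
proof of Proposition 2 of Kim and Samworth (2015). Suppose $\dim(\mathrm{csupp}(\nu)) = d$;
we show $\mathrm{csupp}(\nu) \subset \overline{C}$. To see this, take $x_0 \notin \overline{C}$. Then there exists
$\delta > 0$ such that $B(x_0, \delta) \subset \overline{C}^c$, and we claim that

(6.11)   For all $x^* \in B(x_0, \delta) \subset \overline{C}^c$, $x^* \notin \mathrm{int}(\mathrm{csupp}(\nu))$.

If (6.11) holds, then $x_0 \notin \mathrm{csupp}(\nu)$ and hence $\mathrm{csupp}(\nu) \subset \overline{C}$. Now we turn
to show (6.11). Since $x^* \notin C = \{\liminf_{n \to \infty} f_n(x) > 0\}$, we can find a
subsequence $\{f_{n(k)}\}_{k \in \mathbb{N}}$ of $\{f_n\}_{n \in \mathbb{N}}$ such that $f_{n(k)}(x^*) < 1/k$ holds for all
$k \in \mathbb{N}$. Hence $x^* \notin P_k := \{x \in \mathbb{R}^d : f_{n(k)}(x) \ge 1/k\}$. Note that $P_k$ is a closed
convex set, hence by the Hyperplane Separation Theorem we can find $b_k \in \mathbb{R}^d$
with $\|b_k\| = 1$ such that $\{x \in \mathbb{R}^d : \langle b_k, x \rangle \le \langle b_k, x^* \rangle\} \subset (P_k)^c$. Without loss
of generality we may assume $b_k \to b_{x^*}$ as $k \to \infty$ for some $b_{x^*} \in \mathbb{R}^d$ with
$\|b_{x^*}\| = 1$. Now for fixed $R > 0$ and $\eta > 0$, define

$A_{R,\eta} := \{x \in \mathbb{R}^d : \langle b_{x^*}, x \rangle < \langle b_{x^*}, x^* \rangle - \eta, \|x\| < R\}$.

Choose $k_0 \in \mathbb{N}$ large enough such that $\|b_k - b_{x^*}\| \le \frac{\eta}{2R}$ holds for all
$k \ge k_0(x^*, \eta, R)$. Now for $R > \|x^*\|$ and $x \in A_{R,\eta}$ we have

$\langle b_k, x - x^* \rangle = \langle b_{x^*}, x - x^* \rangle + \langle b_k - b_{x^*}, x - x^* \rangle < -\eta + \frac{\eta}{2R}(\|x\| + \|x^*\|) < 0$

for all $k \ge k_0(x^*, \eta, R)$. This implies that for $R > \|x^*\|$ and $\eta > 0$,

$A_{R,\eta} \subset \{x \in \mathbb{R}^d : \langle b_k, x \rangle \le \langle b_k, x^* \rangle\} \subset (P_k)^c = \{x \in \mathbb{R}^d : f_{n(k)}(x) < 1/k\}$.

Now note that $A_{R,\eta}$ is open; by the Portmanteau Theorem we find that

$\nu(A_{R,\eta}) \le \liminf_{k \to \infty} \nu_{n(k)}(A_{R,\eta}) = \liminf_{k \to \infty} \int_{A_{R,\eta}} f_{n(k)}(x)\,dx \le \liminf_{k \to \infty} \frac{1}{k}\lambda_d(A_{R,\eta}) = 0$.

This implies

$\nu(\{x \in \mathbb{R}^d : \langle b_{x^*}, x \rangle < \langle b_{x^*}, x^* \rangle\}) = \nu\left(\bigcup_{R=1}^{\infty} A_{R,1/R}\right) = \lim_{R \to \infty} \nu(A_{R,1/R}) = 0$,

where the second equality follows from the fact that $\{A_{R,1/R}\}$ is an increasing
family as $R$ increases. By the assumption that $\dim(\mathrm{csupp}(\nu)) = d$, we find
$x^* \notin \mathrm{int}(\mathrm{csupp}(\nu))$, as we claimed in (6.11).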

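Now suppose $\dim C = d$; we claim $\overline{C} \subset \mathrm{csupp}(\nu)$. To see this, we only
have to show $C \subset \mathrm{csupp}(\nu)$ by the closedness of $\mathrm{csupp}(\nu)$. Suppose not,
then we can find $x_0 \in C \setminus \mathrm{csupp}(\nu)$. This implies that there exists $\delta > 0$
such that $B(x_0, \delta) \cap \mathrm{csupp}(\nu) = \emptyset$. By the assumption that $\dim C = d$,
we can find $x_1, \ldots, x_d \in B(x_0, \delta) \cap C$ such that $\{x_0, \ldots, x_d\}$ are in general
position. By definition of $C$ we can find $\epsilon_0 > 0$, $n_0 \in \mathbb{N}$ such that $f_n(x_j) \ge \epsilon_0$
for all $j = 0, 1, \ldots, d$ and $n \ge n_0$. By convexity, we conclude that $f_n(x) \ge \epsilon_0$
for all $x \in \mathrm{conv}(\{x_0, \ldots, x_d\})$ and $n \ge n_0$. This gives

$\nu(\mathrm{conv}(\{x_0, \ldots, x_d\})) \ge \limsup_{n \to \infty} \nu_n(\mathrm{conv}(\{x_0, \ldots, x_d\}))
    \ge \epsilon_0 \lambda_d(\mathrm{conv}(\{x_0, \ldots, x_d\})) > 0$,

a contradiction with $B(x_0, \delta) \cap \mathrm{csupp}(\nu) = \emptyset$, thus completing the proof of
the claim. To summarize, we have proved

1. If $\dim(\mathrm{csupp}(\nu)) = d$, then $\mathrm{csupp}(\nu) \subset \overline{C}$. This in turn implies
$\dim C = d$, and hence $\overline{C} \subset \mathrm{csupp}(\nu)$. Now it follows that $\mathrm{csupp}(\nu) = \overline{C}$.

2. If $\dim C = d$, then $\overline{C} \subset \mathrm{csupp}(\nu)$. This in turn implies $\dim(\mathrm{csupp}(\nu)) = d$,
and hence $\mathrm{csupp}(\nu) \subset \overline{C}$. Now it follows that $\mathrm{csupp}(\nu) = \overline{C}$. □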

Proof of Lemma 3.2. The proof is essentially the same as the proof
of Proposition 2 of Cule and Samworth (2010), exploiting convexity at the
level of the underlying basic convex function, so we shall omit it. □

Proof of Lemma 3.3. Set Un,t = {x G : /n(x) > t}. We first claim 
that there exists no G N, eo G (0,1) such that Xd{Un,eo) > eo holds for all 
n > uq. If not, then for all A: G N, / G N, there exists G N such that 
XdiUn^j,i/i) < j- Note that {hminf„/„ > 0} = nn>fcLn,i/z- Since 

U/eN n n>k Un,i/i) = OO Ad(n n>k Un,l/l) ^ Xd{Uni^ i^l/l) 

0, we find that $C = \{\liminf_n f_n > 0\}$ is a countable union of null sets and
hence $\lambda_d(C) = 0$, a contradiction to the assumption $\dim C = d$. This shows
the claim.

Denote Mn := sup^-gj^d/^(x), e^ G Argmax/„(x). Without loss of gen¬ 
erality we assume where Ks = (1/2)® — 1 > 0, and we set 

Xn ■= G [0,1]. Now for x G m, by convexity of /® we have 

fn i^n + Xn{x — 6^)) < Xnfn{x) + {1 — Xn)fn{^n) < XneQ + {l —Xn)= {Mn/2)^. 
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This implies fn{x) > M„/2 := for all x G Vn^^o ■= {^n + — e„) : x G 

Un,eo}- Hence 14,^0 C and therefore ( 14 ,^ 0 ) = >^d{Un,eo)K^ thus 

UUn,nJ > Ad(K,.o) = UUn,eo)>^i > 


holds for all n > uq. On the other hand, 

1 = y fn> ^n^d{Un,Q.n) > ^nCoA^, 

and suppose the contrary that A4 —)> 00 as n —)• 00 , then 

A 


1 > 0„eoA — 


eoK 


2(6g - M*) 


^ 00 , n^oo, 

^ ^s\a ^ ’ 


since 1 + sd > 0 by assumption —l/d<s<0. Here c = This gives 

a contradiction and the proof is complete. □ 

Proof of Theorem 3.4. We only have to show that $\nu$ is absolutely continuous
with respect to $\lambda_d$. To this end, for given $\epsilon > 0$, choose $\delta = \epsilon/(2M)$,
where $M := \sup_n \|f_n\|_\infty < \infty$ by virtue of Lemma 3.3. Now for a Borel set
$A \subset \mathbb{R}^d$ with $\lambda_d(A) \le \delta$, we can take an open $A' \supset A$ such that $\lambda_d(A') \le 2\delta$
by the regularity of Lebesgue measure. Then

$\nu(A) \le \nu(A') \le \liminf_{n \to \infty} \nu_n(A') = \liminf_{n \to \infty} \int_{A'} f_n \le 2\delta M = \epsilon$,

as desired. □


Proof of Lemma 3.5. Let gn = fn and g = f^- Without loss of gen¬ 
erality we assume 0 G int(dom(g)), and choose g > 0 small enough such 
that Brj := B{0,g) C int(dom(g')). By the Lemma 7.10, we know there ex¬ 
ists a > 0,72 > 0 such that > a, holds for all ||x|| > Now 

we claim that there exists no G N such that > f, holds for all 

||x|| > R and n > uq. Note for each n G N, by convexity of gni'), we know 
that for fixed x G the quantity is non-decreasing in A, so 

we only have to show the claim for ||x|| = R and no > n. Suppose the con¬ 
trary, then we can find a subsequence {gn{k)} and = R such that 

9n(k)(^n(k)) gn(fc)(o) ^ a simplicity of notation we think of {gn},{xn} 

ll^n(fc) II ^ 

as { 9 n{k)} , {Xn{k)} ■ Now define An := conv({xn, H^}); := { 7 / G M"* : 

lly — Xn\\ < 72/2}; Cn '■= An n Bn- By reducing 7 / > 0 if necessary, we may 
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assume Brj n Bn = 0. It is easy to see Cn is convex and Xd{Cn) = Aq is a 
constant independent of n G N. By Lemma 3.2, we know that gn -^a.e. 9 
on Brj, and hence sup^jg^^ \gnix) — g{x)\ 0(n —)• oo) by Theorem 10.8, 

Rockafellar (1997). By further reducing r/ > 0 if necessary, we may assume 
9niy) < < 7 ( 0 ) + holds for all y ^ B^ and n G N. Now for any x* G Cn, 
write X* = Xxn + (1 — X)y, by noting R/2 < ||x*|| < R and convexity of gn, 
we get 


gn{x*) - gn(0) 
||x*|| 


Xgnjxn) + (1 - X)gn{y) - 3 ^( 0 ) 

||x*|| 

^ 9n{Xn) ~ g-ni^) ||^n|| ^ (?/) ~ <7n(0) 


a R 


+ (1-A) 


aR/8 

R/2 


a 

4' 


This gives rise to 

liminf [ (/„-/)> liminf Ao((aii’/4 + - (aii’/2 + 

n^oo Jq n^oo ' 

= Ao((aR/4 + 5(0))^/* - (aR/2 + 5 ( 0 ))^/'*) > 0, 

which is a contradiction to Lemma 7.16. This establishes our claim. Now 
by Lemma 3.2, we find that the set {liminf^/„(•) > 0} is full-dimensional, 
and hence by Lemma 3.3 we conclude (7n(') is uniformly bounded away from 
zero. Also note by Lemma 7.15 we find g{-) must be bounded away from 
zero, which gives the desired assertion. □ 

Before the proof of Theorem 3.7, we first state some useful lemmas that 
give good control of tails with local information of the s-concave densities; 
the proof can be found in Section 7.1. 

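Lemma 6.2. Let $x_0, \ldots, x_d$ be $d + 1$ points in $\mathbb{R}^d$ such that their convex
hull $\Delta = \mathrm{conv}(\{x_0, \ldots, x_d\})$ is non-void. If $f(y) \le \min_j \left(\frac{1}{d}\sum_{i \ne j} f^s(x_i)\right)^{1/s}$,
then

$f(y) \le f_{\max}\left(1 - \frac{d}{r} + \frac{d}{r} f_{\min} C (1 + \|y\|^2)^{1/2}\right)^{-r}$.

Here the constant $C = \lambda_d(\Delta)(d+1)^{-1/2}\sigma_{\max}(X)^{-1}$, where
$X = \begin{pmatrix} x_0 & \cdots & x_d \\ 1 & \cdots & 1 \end{pmatrix}$,
and $f_{\min} := \min_{0 \le j \le d} f(x_j)$, $f_{\max} := \max_{0 \le j \le d} f(x_j)$.

Lemma 6.3. Let $\nu$ be a probability measure with $s$-concave density $f$.
Suppose that $B(0, \delta) \subset \mathrm{int}(\mathrm{dom}(f))$ for some $\delta > 0$. Then for any $y \in \mathbb{R}^d$
and $t \in (0, 1)$,

$\sup_{x \in B(y, \delta_t)} f(x) \le J_0 \left(\frac{1}{t}\left(\left(\frac{\nu(B(ty, \delta_t))}{J_0 \lambda_d(B(ty, \delta_t))}\right)^{-1/r} - (1 - t)\right)\right)^{-r}$,

where $J_0 := \inf_{v \in B(0, \delta)} f(v)$ and $\delta_t = \delta\frac{1-t}{1+t}$.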

We are now in a position to prove Theorem 3.7.




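Proof of Theorem 3.7. That the sequence $\{f_n\}_{n \in \mathbb{N}}$ converges uniformly
on any compact subset of $\mathrm{int}(\mathrm{dom}(f))$ follows directly from Lemma 3.2 and
Theorem 10.8 of Rockafellar (1997). Now we show that if $f$ is continuous at
$y \in \mathbb{R}^d$ with $f(y) = 0$, then for any $\eta > 0$ there exists $\delta = \delta(y, \eta)$ such that

(6.12)   $\limsup_{n \to \infty} \sup_{x \in B(y, \delta(y, \eta))} f_n(x) \le \eta$.

Assume without loss of generality that $B(0, \delta_0) \subset \mathrm{int}(\mathrm{dom}(f))$ for some
$\delta_0 > 0$. Let $J_0 := \inf_{x \in B(0, \delta_0)} f(x)$. Then uniform convergence of $\{f_n\}$ to $f$
over $B(0, \delta_0)$ entails that

$\liminf_{n \to \infty} \inf_{x \in B(0, \delta_0)} f_n(x) \ge J_0$.

Hence with $\delta_t = \delta_0\frac{1-t}{1+t}$, it follows from Lemma 6.3 that

$\limsup_{n \to \infty} \sup_{x \in B(y, \delta_t)} f_n(x)
    \le J_0 \left(\frac{1}{t}\left(\left(\frac{\nu(B(ty, \delta_t))}{J_0 \lambda_d(B(ty, \delta_t))}\right)^{-1/r} - (1 - t)\right)\right)^{-r}
    \le J_0 \left(\frac{1}{t}\left(J_0^{1/r}\left(\sup_{x \in B(ty, \delta_t)} f(x)\right)^{-1/r} - (1 - t)\right)\right)^{-r}
    \to 0$

as $t \nearrow 1$. This completes the proof of (6.12). So far we have shown that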


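$\lim_{n \to \infty} \sup_{x \in S \cap B(0, \rho)} |f_n(x) - f(x)| = 0$

holds for every $\rho > 0$, where $S$ is the closed set contained in the
continuity points of $f$. Our goal is to let $\rho \to \infty$ and conclude. Let $\Delta =
\mathrm{conv}(\{x_0, \ldots, x_d\})$ be a non-void simplex with $x_0, \ldots, x_d \in \mathrm{int}(\mathrm{dom}(f))$.
Note first that, by a closer look at the proof of Lemma 3.5, $f_n(x) \vee f(x) \le
(a\|x\| - b)^{1/s}$ holds for all $x \in \mathbb{R}^d$ with some $a, b > 0$. Let $\rho_0 := \inf\{\rho >
0 : (a\rho - b)^{1/s} \le f_{\min}/2\}$, where $f_{\min} := \min_{0 \le j \le d} f(x_j) > 0$. Then

$\{x \in \mathbb{R}^d : \|x\| \ge \rho_0\} \subset \bigcap_{n \ge 1} \{f_n \le f_{\min}/2\} \cap \{f \le f_{\min}/2\}$
    $\subset \bigcap_{n \ge n_0} \{f_n \le \min_j f_n(x_j)\} \cap \{f \le f_{\min}\}$
    $\subset \bigcap_{n \ge n_0} \left\{f_n \le \min_j \left(\frac{1}{d}\sum_{i \ne j} f_n^s(x_i)\right)^{1/s}\right\} \cap \left\{f \le \min_j \left(\frac{1}{d}\sum_{i \ne j} f^s(x_i)\right)^{1/s}\right\}$,

where $n_0 \in \mathbb{N}$ is a large constant. The second inclusion follows from the
fact that $\lim_{n \to \infty} f_n(x_i) = f(x_i)$ holds for $i = 0, \ldots, d$. By Lemma 6.2 we
conclude that

$\limsup_{n \to \infty} \sup_{x : \|x\| \ge \rho \vee \rho_0} (1 + \|x\|)^\kappa (f_n(x) \vee f(x))
    \le \sup_{x : \|x\| \ge \rho \vee \rho_0} f_{\max}(1 + \|x\|)^\kappa \left(1 - \frac{d}{r} + \frac{d}{r} f_{\min} C (1 + \|x\|^2)^{1/2}\right)^{-r} \to 0$,

as $\rho \to \infty$. This completes the proof. □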


Proof of Theorem 3.8. Since Vj/n(x) = -rpn(x)^/® 


|V^/n(x)-V^/(x)| 


= r 


ignix) - g{xfl^SI^g{x) 

< ?■{ fnix) |V^ 5 n(x) - V^ 5 f(x)| + |/n(x) - /(x)| |V^p(x)| 


< 2rsup|/(x)| |V^Pn(x) - V^fi((x)| +rsup|/n(x) - /(x)| SUp||Vfi((x )||2 

xST xST xST 


holds for $n$ large enough by Theorem 3.7. By Theorem 23.4 in Rockafellar
(1997), $\nabla_\xi g_n(x) = \tau_x^T \xi$ for some $\tau_x \in \partial g_n(x)$ since $\partial g_n(x)$ is a closed set.
Thus the first term above is further bounded by

$2r \sup_{x \in T} |f(x)| \sup_{x \in T, \tau \in \partial g_n(x)} \|\tau - \nabla g(x)\|_2$,

which vanishes as $n \to \infty$ in view of Lemma 3.10 in Seijo and Sen (2011).
Note that $\nabla g(\cdot)$ is continuous on $T$ by Corollary 25.5.1 in Rockafellar (1997),
and hence $\sup_{x \in T} \|\nabla g(x)\|_2 < \infty$. Now it is easy to see that the second term
also vanishes as $n \to \infty$ by virtue of Theorem 3.7. □






6.3. Proofs for Section 4. Before we prove Theorem 4.1, we will need the
following tightness result.


Theorem 6.4. We have the following conclusions. 

1. For fixed K > 0, the modified local process converges weakly 


to a drifted integrated Gaussian process on C[—K,K]: 



where W{-) is the standard two-sided Brownian motion starting from 


0 on M. 


2. The localized processes satisfy 


> 0 , 

with equality attained for all t such that xq + G S{gn). 

3. The sequences {An} and {Bn} are tight. 



The above theorem includes everything necessary in order to apply the 


‘invelope’ argument roughly indicated in Section 4.1. For a proof of this 
technical result, we refer the reader to Section 7.2. Here we will provide 
proofs for our main results. 

Proof of Theorem 4.1. By the same tightness and uniqueness argu¬ 
ment adopted in Groeneboom, Jongbloed and Wellner (2001), Balabdaoui and Wellner 
(2007), and Balabdaoui, Rufibach and Wellner (2009), we only have to find 
the rescaling constants. To this end we denote ]H[(-),Y(-) the correspond¬ 
ing limit of and YJ(''^™°‘^(-) in the uniform topology on the space 

C[—K, K], and let Y(t) = 7 iTa:( 721)) where by Theorem 6.4, we know that 



Let a := {Mxo)) and b := then by rescaling property of 

Brownian motion, we find that 7172 ^^ = R)7i72'*'^ = Solving for 71,72 
yields 


Let a := 


(6.13) 


2 fc +4 3 

= Qi2fc+l5 2fc+l 


2 2 
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On the other hand, by (4.3), let n —oo, we find that 
{gn{xo + Snt) - go{xo) - Sntgo{xo))\ 


\ —r 


EI(t) 


It is easy to see that = 7 i 72 ^-fffc( 72 i) nd = -fij^^Hk{'y 2 t). 

Now by substitution in (6.13) we get the conclusion by direct calculation and 
the delta method. □ 

Proof of Theorem 4.4. The proof is essentially the same as that of
Theorem 3.6 of Balabdaoui, Rufibach and Wellner (2009). □

Lemma 6.5. Assume (Al)-(A4)- Then 

rg^’'\mo) 


/ OO 

feix) 

-OO 


dx = 1 + TTfc 


g(mo)"+i 


+ o(e*'+^). 


where

$\pi_k = \frac{1}{(k+1)!}\left[3^{k-1}(2k^2 - 4k + 3) + 2k^2 - 1\right]$.


Proof of Lemma 6.5. This is straightforward calculation by Taylor ex¬ 
pansion. Note that 

/ OO roo 

57 '’(x) dx = / {g~''ix) - g~''{x)) dx -M 

-OO J —oo 


' —oo 
rmo—e 


• mo—Cee 

rmo+e 


"'{x) - g "'{x) dx 


+ 


f 


'mo—e 
■.= 1 + 11 + 1. 


9e ''{x) - g "'(x) dx-b 1 


For 7 / > X, we have X —y ^ ( n)(—i)^(g —x)"'g Now for the 
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first term above, we continue our calculation of its leading term by noting 


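(6.15)
$g(x) - g_\epsilon(x)$
  $= g(x) - g(m_0 - c_\epsilon\epsilon) - (x - m_0 + c_\epsilon\epsilon) g'(m_0 - c_\epsilon\epsilon)$
  $= \frac{g^{(k)}(m_0)}{k!}(x - m_0)^k - \frac{g^{(k)}(m_0)}{k!}(-c_\epsilon\epsilon)^k - (x - m_0 + c_\epsilon\epsilon)\frac{g^{(k)}(m_0)}{(k-1)!}(-c_\epsilon\epsilon)^{k-1}$ + higher order terms
  $= \frac{g^{(k)}(m_0)}{k!}\left[(x - m_0)^k - c_\epsilon^k\epsilon^k + k c_\epsilon^{k-1}\epsilon^{k-1}(x - m_0 + c_\epsilon\epsilon)\right]$ + higher order terms.

Here we used the fact that $k$ is an even number, as shown in Lemma 7.2. Thus
we have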

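leading term of I
  $= \int_{m_0 - c_\epsilon\epsilon}^{m_0 - \epsilon} r\left(g(x) - g(m_0 - c_\epsilon\epsilon) - (x - m_0 + c_\epsilon\epsilon) g'(m_0 - c_\epsilon\epsilon)\right) g(x)^{-r-1}\,dx$
  $= \frac{r g^{(k)}(m_0)}{k!\, g(m_0)^{r+1}} \int_{m_0 - c_\epsilon\epsilon}^{m_0 - \epsilon} \left[(x - m_0)^k - c_\epsilon^k\epsilon^k + k c_\epsilon^{k-1}\epsilon^{k-1}(x - m_0 + c_\epsilon\epsilon)\right] dx + o(\epsilon^{k+1})$
  $= \alpha_k \frac{r g^{(k)}(m_0)}{g(m_0)^{r+1}} \epsilon^{k+1} + o(\epsilon^{k+1})$.

Here

$\alpha_k = \frac{1}{(k+1)!}\left[3^{k-1}(2k^2 - 4k + 3) - 1\right]$.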


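For the second term,

(6.16)
$g(x) - g_\epsilon(x)$
  $= g(x) - g(m_0 + \epsilon) - (x - m_0 - \epsilon) g'(m_0 + \epsilon)$
  $= \frac{g^{(k)}(m_0)}{k!}\left[(x - m_0)^k - \epsilon^k - k\epsilon^{k-1}(x - m_0 - \epsilon)\right]$ + higher order terms.

Now similar calculations yield that the second term equals
$\beta_k \frac{r g^{(k)}(m_0)}{g(m_0)^{r+1}} \epsilon^{k+1} + o(\epsilon^{k+1})$, where

$\beta_k = \frac{2k^2}{(k+1)!}$.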


This gives the conclusion. 


□ 
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Proof of Lemma 4.6. By definition of the Hellinger metric and Lemma 
6.5, we have 


/ CXD 

Wfe{x) - \/f{x)Y dx 

-CXD 


' —CXD 

r*cxD 


• —OO 

/*oo 


1 - 


TTfc rc/(^)(mo) j,+i fc+i 


2 

57'’/2(x)( 1 + r/fc(e)) - 5r"^/^(x)) dx 




Since 


-1 


Here r]k{e) = Splitting two terms apart in the above integral we 

get 

2/i^(/€, /) = y (^ 97 ''^^ix) - 9 ~'’^‘^{x) + r]kie)g-'’/^{x)^ dx 

/ CXD poo 

(57^/2 (x) - 5"’’/2 (x))^ dx + (%(e))^ / 57 '” (a;) dx 

-CXD J —CXD 

/ OO 

-OO 

= / + // + ///. 

Now for the first term, 


/ = 


f-mo+e J.2 


— [^(x) — ge{x)\^g{x) ^ ^ dx + higher order terms 


! mo—Cf,e 


rrriQ+e ^ 

/ \9{x) — 5 e(x)] dx + higher order terms 

J mc\—Cfe 


49(mo)^+2 


' mo—CeE 

^2 / j-mo-e r-mo+e\ ^ 

——-T-Tw 1 / + / 1 — 5 e(x)l dx + higher order terms 

4^(?7Xo)^ \ Jmn—Cee JmQ — e / 


'mQ—Cee JiriQ — 

= Ii + I 2 + higher order terms. 


dx 
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By (6.15) and (6.16) we see that for i = 1,2, 


h = 


Ag{mQY+‘^ 


[ [g{x) - ge{x)]‘^ dx 

Jii 


= c 


(i) r'^f{mo)g''^YmoY 


2k+l 


+ o(e^^+^). 


g{moY 

Here Ii = [niQ — c^e, niQ — e], X 2 = [mo — e, mo + e], and 


^(1) _ 
St — 


1 


- 4 • 3^+^i2k + 1)(3^+2 + k‘^ + k-3) 


l08{k\Y{k + l){k + 2){2k + l) 

+ {k + l){k + 2) (^27(3^'=+^ - 1) + 2 • 3^’^{2k + l){2k{2k - 9) + 27) 
2k^{2P + 1 ) 


^(2) ^ _ 

3ik\Y{k + l){2k + l)' 

On the other hand, II = = o(e^*^+^) and \III\ < 

g( 2 fc+ 2 )/ 2 ^ _ pj'g 2 fc+i^ i^y Cauchy-Schwarz. This completes the proof. □ 

7. Appendix.

7.1. Proofs of Lemmas 6.2 and 6.3.

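Lemma 7.1. Let $\nu$ be a probability measure with $s$-concave density $f$,
and let $x_0, \ldots, x_d \in \mathbb{R}^d$ be $d + 1$ points such that $\Delta := \mathrm{conv}(\{x_0, \ldots, x_d\})$ is
non-void. If $f(x_0) \le \left(\frac{1}{d}\sum_{i=1}^d f^s(x_i)\right)^{1/s}$, then

$f(x_0) \le \bar{g}^{-r}\left(1 - \frac{d}{r} + \frac{d}{r} \cdot \frac{\lambda_d(\Delta)\,\bar{g}^{-r}}{\nu(\Delta)}\right)^{-r}$,

where $\bar{g} := \frac{1}{d}\sum_{i=1}^d f^s(x_i)$.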

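Proof of Lemma 7.1. Write $g := f^s$ and $g_0 := g(x_0)$. For any point $x \in \Delta$,
we can find some $u = (u_1, \ldots, u_d) \in \Delta_d = \{u \in [0,1]^d : \sum_{i=1}^d u_i \le 1\}$ such that
$x(u) = \sum_{i=0}^d u_i x_i$. Here $u_0 := 1 - \sum_{i=1}^d u_i$. We use the following representation
of integration on the unit simplex $\Delta_d$: for any measurable function
$h : \Delta_d \to [0, \infty)$, we have $\int_{\Delta_d} h(u)\,du = \frac{1}{d!} E h(B_1, \ldots, B_d)$, where
$B_i = E_i / \sum_{j=0}^d E_j$ with independent, standard exponentially distributed random
variables $E_0, \ldots, E_d$. Hence

$\frac{\nu(\Delta)}{\lambda_d(\Delta)} = \frac{1}{\lambda_d(\Delta_d)} \int_{\Delta_d} g(x(u))^{1/s}\,du = E\, g\left(\sum_{i=0}^d B_i x_i\right)^{1/s}$.

Let $\tilde{B}_i := B_i / (1 - B_0)$ for $1 \le i \le d$. Following Cule and Dümbgen
(2008), it is known that $B_0$ and $(\tilde{B}_i)_{i=1}^d$ are independent, and $E\tilde{B}_i = 1/d$.
Hence it follows from convexity of $g$ and Jensen's inequality that

$\frac{\nu(\Delta)}{\lambda_d(\Delta)}
    \ge E\left[B_0 g_0 + (1 - B_0)\sum_{i=1}^d \tilde{B}_i g(x_i)\right]^{1/s}
    \ge E\left[B_0 g_0 + (1 - B_0)\frac{1}{d}\sum_{i=1}^d g(x_i)\right]^{1/s}
    = E\left(B_0 g_0 + (1 - B_0)\bar{g}\right)^{1/s}
    = \int_0^1 d(1 - t)^{d-1}\left(t g_0 + (1 - t)\bar{g}\right)^{1/s} dt
    = \bar{g}^{1/s} \int_0^1 d(1 - t)^{d-1}\left(1 - st \cdot \left(-\frac{1}{s}\right)\left(\frac{g_0}{\bar{g}} - 1\right)\right)^{1/s} dt
    = \bar{g}^{-r} J_{d,s}\left(\left(-\frac{1}{s}\right)\left(\frac{g_0}{\bar{g}} - 1\right)\right)$,

where

$J_{d,s}(y) = \int_0^1 d(1 - t)^{d-1}(1 - syt)^{1/s}\,dt$.

We claim that

$J_{d,s}(y) \ge \int_0^1 d(1 - t)^{d-1}(1 - t)^y\,dt = \frac{d}{d + y}$

holds for $s < 0$, $y > 0$. To see this, we write $(1 - syt)^{1/s} = \left((1 + yt/r)^{r/y}\right)^{-y}$.
Then we only have to show $(1 + yt/r)^{-r/y} \ge (1 - t)$ for $0 \le t \le 1$, or
equivalently $(1 + bt) \le (1 - t)^{-b}$, where we let $b = y/r$. Let $g(t) := (1 - t)^{-b}
- (1 + bt)$. It is easy to verify that $g(0) = 0$, $g'(t) = b(1 - t)^{-b-1} - b$
with $g'(0) = 0$, and $g''(t) = b(b + 1)(1 - t)^{-b-2} > 0$. Integrating $g''$ twice
yields $g(t) \ge 0$, and hence we have verified the claim. Now we proceed with
the calculation

$\frac{\nu(\Delta)}{\lambda_d(\Delta)} \ge \bar{g}^{-r} J_{d,s}\left(\left(-\frac{1}{s}\right)\left(\frac{g_0}{\bar{g}} - 1\right)\right) \ge \bar{g}^{-r} \frac{d}{d + \left(-\frac{1}{s}\right)\left(\frac{g_0}{\bar{g}} - 1\right)}$.

Solving for $g_0$ and replacing $-1/s = r$ proves the desired inequality. □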
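The two inequalities used in the proof above, $(1 + bt) \le (1 - t)^{-b}$ and $J_{d,s}(y) \ge d/(d + y)$, can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names, the midpoint rule and the grid size are illustrative assumptions.

```python
def j_ds(d: int, s: float, y: float, n: int = 5000) -> float:
    """Midpoint-rule approximation (illustrative) of
    J_{d,s}(y) = int_0^1 d (1-t)^(d-1) (1 - s*y*t)^(1/s) dt, for s < 0, y > 0."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoint of the i-th cell, always strictly inside (0, 1)
        total += d * (1.0 - t) ** (d - 1) * (1.0 - s * y * t) ** (1.0 / s)
    return total * h


def beta_lower_bound(d: int, y: float) -> float:
    """Closed form of int_0^1 d (1-t)^(d-1) (1-t)^y dt = d / (d + y)."""
    return d / (d + y)


def bernoulli_gap(b: float, t: float) -> float:
    """g(t) = (1-t)^(-b) - (1 + b*t), which the proof shows is >= 0 on [0, 1)."""
    return (1.0 - t) ** (-b) - (1.0 + b * t)


if __name__ == "__main__":
    for d in (1, 2, 5):
        for s in (-0.9, -0.5, -0.1):
            for y in (0.1, 1.0, 10.0):
                print(d, s, y, j_ds(d, s, y), beta_lower_bound(d, y))
```

In each printed row the third-to-last column (the numerical value of $J_{d,s}(y)$) should dominate the last column.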


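Proof of Lemma 6.2. For fixed $j \in \{0, \ldots, d\}$, note that
$|\det(x_i - x_j : i \ne j)| = |\det X|$, where
$X = \begin{pmatrix} x_0 & \cdots & x_d \\ 1 & \cdots & 1 \end{pmatrix}$.
Also for each $y \in \mathbb{R}^d$, since $\Delta = \mathrm{conv}(\{x_0, \ldots, x_d\})$ is non-void, $y$ must
be in the affine hull of $\Delta$ and hence we can write $y = \sum_{i=0}^d \lambda_i x_i$ with
$\sum_{i=0}^d \lambda_i = 1$ (the $\lambda_i$ are not necessarily non-negative), i.e.
$\lambda = X^{-1}\binom{y}{1}$. Let $\Delta_j(y) := \mathrm{conv}(\{x_i : i \ne j\} \cup \{y\})$. Then

$\lambda_d(\Delta_j(y)) = \frac{1}{d!}\left|\det\begin{pmatrix} x_0 & \cdots & x_{j-1} & y & x_{j+1} & \cdots & x_d \\ 1 & \cdots & 1 & 1 & 1 & \cdots & 1 \end{pmatrix}\right| = |\lambda_j| \frac{1}{d!} |\det X| = |\lambda_j| \lambda_d(\Delta)$.

Hence,

$\max_{0 \le j \le d} \lambda_d(\Delta_j(y)) = \lambda_d(\Delta) \max_j |\lambda_j| = \lambda_d(\Delta) \left\|X^{-1}\binom{y}{1}\right\|_\infty
    \ge \lambda_d(\Delta)(d + 1)^{-1/2} \left\|X^{-1}\binom{y}{1}\right\|_2
    \ge \lambda_d(\Delta)(d + 1)^{-1/2} \sigma_{\max}(X)^{-1}(1 + \|y\|^2)^{1/2} = C(1 + \|y\|^2)^{1/2}$.

Now the conclusion follows from Lemma 7.1 by noting

$f(y) \le \bar{g}_j^{-r}\left(1 - \frac{d}{r} + \frac{d}{r} \cdot \frac{\lambda_d(\Delta_j(y))\,\bar{g}_j^{-r}}{\nu(\Delta_j(y))}\right)^{-r}
    \le f_{\max}\left(1 - \frac{d}{r} + \frac{d}{r} f_{\min} C (1 + \|y\|^2)^{1/2}\right)^{-r}$,

since $\bar{g}_j^{-r} = \left(\frac{1}{d}\sum_{i \ne j} f^s(x_i)\right)^{1/s}$ and hence $f_{\min} \le \bar{g}_j^{-r} \le f_{\max}$, and the index
$j$ is chosen such that $\lambda_d(\Delta_j(y))$ is maximized. □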
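The volume identity $\lambda_d(\Delta_j(y)) = |\lambda_j| \lambda_d(\Delta)$ and the norm comparison $\|\lambda\|_\infty \ge (d+1)^{-1/2}\|\lambda\|_2$ used in the proof can be illustrated in dimension $d = 2$. The sketch below is an illustration under that assumption only; the helper names and the test triangle are hypothetical and not taken from the paper.

```python
import math


def det2(a, b):
    """Determinant of the 2x2 matrix with columns a and b."""
    return a[0] * b[1] - a[1] * b[0]


def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])


def area(p, q, r):
    """Lebesgue measure of the triangle conv({p, q, r}) in R^2."""
    return 0.5 * abs(det2(sub(q, p), sub(r, p)))


def affine_weights(x0, x1, x2, y):
    """Weights (l0, l1, l2) summing to 1 with y = l0*x0 + l1*x1 + l2*x2.

    Solves y - x0 = l1*(x1 - x0) + l2*(x2 - x0) by Cramer's rule.
    The weights may be negative when y lies outside the triangle.
    """
    e1, e2, rhs = sub(x1, x0), sub(x2, x0), sub(y, x0)
    den = det2(e1, e2)
    l1 = det2(rhs, e2) / den
    l2 = det2(e1, rhs) / den
    return (1.0 - l1 - l2, l1, l2)


def subsimplex_areas(x0, x1, x2, y):
    """Areas of Delta_j(y), obtained by replacing the j-th vertex with y."""
    return (area(y, x1, x2), area(x0, y, x2), area(x0, x1, y))


if __name__ == "__main__":
    x0, x1, x2, y = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, -2.0)
    lam = affine_weights(x0, x1, x2, y)
    total = area(x0, x1, x2)
    print(lam, subsimplex_areas(x0, x1, x2, y), [abs(l) * total for l in lam])
    print(max(abs(l) for l in lam), math.sqrt(sum(l * l for l in lam) / 3.0))
```

For points far from the simplex, the largest sub-simplex area grows linearly in $\|y\|$, which is what drives the tail bound in Lemma 6.2.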


Proof of Lemma 6.3. The key point is that for any point $x \in B(y, \delta_t)$,

$B(ty, \delta_t) \subset (1 - t)B(0, \delta) + tx$,

which can be shown in the same way as in the proof of Lemma 4.2 of
Schuhmacher, Hüsler and Dümbgen (2011). Namely, pick any $w \in B(ty, \delta_t)$ and
let $v := (1 - t)^{-1}(w - tx)$. Then

$\|v\| = (1 - t)^{-1}\|w - tx\| = (1 - t)^{-1}\|w - ty + t(y - x)\| \le (1 - t)^{-1}(\delta_t + t\delta_t) = \delta$,

and hence $v \in B(0, \delta)$. This implies that $w = (1 - t)v + tx \in (1 - t)B(0, \delta) + tx$,
as desired. By $s$-concavity of $f$, we have

$f(w) \ge \left((1 - t)f(v)^s + tf(x)^s\right)^{1/s} \ge \left((1 - t)J_0^s + tf(x)^s\right)^{1/s}$.

Averaging over $w \in B(ty, \delta_t)$ yields

$\frac{\nu(B(ty, \delta_t))}{\lambda_d(B(ty, \delta_t))} \ge J_0\left(1 - t + t\left(\frac{f(x)}{J_0}\right)^s\right)^{1/s}$.

Solving for $f(x)$ completes the proof. □
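The inclusion $B(ty, \delta_t) \subset (1 - t)B(0, \delta) + tx$ can be probed by simulation. The sketch below assumes $\delta_t = \delta(1 - t)/(1 + t)$, which is the value that makes the final equality $(1 - t)^{-1}(\delta_t + t\delta_t) = \delta$ in the proof hold; the dimension (two), the sampling scheme and all names are illustrative choices.

```python
import math
import random


def delta_t(delta: float, t: float) -> float:
    """Assumed radius delta_t = delta * (1 - t) / (1 + t)."""
    return delta * (1.0 - t) / (1.0 + t)


def random_point_in_ball(center, radius, rng):
    """Uniform draw from the closed disc B(center, radius) by rejection sampling."""
    while True:
        u = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        if u[0] ** 2 + u[1] ** 2 <= 1.0:
            return (center[0] + radius * u[0], center[1] + radius * u[1])


def max_norm_of_v(y, delta, t, trials=2000, seed=0):
    """Largest observed ||v|| with v = (w - t*x) / (1 - t),
    where x is drawn from B(y, delta_t) and w from B(t*y, delta_t)."""
    rng = random.Random(seed)
    dt = delta_t(delta, t)
    ty = (t * y[0], t * y[1])
    worst = 0.0
    for _ in range(trials):
        x = random_point_in_ball(y, dt, rng)
        w = random_point_in_ball(ty, dt, rng)
        v = ((w[0] - t * x[0]) / (1.0 - t), (w[1] - t * x[1]) / (1.0 - t))
        worst = max(worst, math.hypot(v[0], v[1]))
    return worst


if __name__ == "__main__":
    for t in (0.1, 0.5, 0.9):
        print(t, max_norm_of_v((3.0, -1.0), 0.7, t))
```

Every printed value should stay at or below $\delta = 0.7$, in line with $v \in B(0, \delta)$.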


7.2. Proof of Theorem 6.4. We first observe the following.

Lemma 7.2. $k$ is an even integer and $g_0^{(k)}(x_0) > 0$.

Proof of Lemma 7.2. By Taylor expansion of $g_0''$ around $x_0$, we find
that locally for $x \approx x_0$,

$g_0''(x) = \frac{g_0^{(k)}(x_0)}{(k-2)!}(x - x_0)^{k-2} + o\left((x - x_0)^{k-2}\right)$.

Also note that $g_0''(x) \ge 0$ by convexity and the local smoothness assumed in (A3).
This gives that $k - 2$ is even and $g_0^{(k)}(x_0) > 0$. □

For further technical discussions, we denote throughout this subsection
that for fixed $k$, $r_n := n^{(k+2)/(2k+1)}$; $s_n := n^{-1/(2k+1)}$; $x_n(t) := x_0 + s_n t$;
$I_{n,x_0} := [x_0, x_n(t)]$. Let $\tau_n^+ := \inf\{t \in S_n(g_n) : t > x_0\}$ and
$\tau_n^- := \sup\{t \in S_n(g_n) : t < x_0\}$. The key step in establishing the limit
theory is to establish a stochastic bound for the gap $\tau_n^+ - \tau_n^-$ as follows.


Theorem 7.3. Assume (A1)-(A4) hold. Then

$\tau_n^+ - \tau_n^- = O_p(s_n)$.
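Orders of magnitude in this subsection are products of powers of $r_n$ and $s_n$, so it can help to track the exponents of $n$ exactly. The sketch below assumes $r_n = n^{(k+2)/(2k+1)}$ and $s_n = n^{-1/(2k+1)}$ as defined at the start of this subsection; the helper name is hypothetical.

```python
from fractions import Fraction


def rate_exponents(k: int):
    """Exponents (a, b) of n such that r_n = n**a and s_n = n**b (assumed definitions)."""
    a = Fraction(k + 2, 2 * k + 1)
    b = Fraction(-1, 2 * k + 1)
    return a, b


if __name__ == "__main__":
    for k in (2, 4, 6):
        a, b = rate_exponents(k)
        # r_n * s_n**(k+2) has exponent a + (k+2)*b; n * s_n**(2k+1) has exponent 1 + (2k+1)*b
        print(k, a, b, a + (k + 2) * b, 1 + (2 * k + 1) * b, a + 2 * b)
```

The fourth and fifth columns are identically zero, i.e. $r_n s_n^{k+2} = 1$ and $n s_n^{2k+1} = 1$, while the last column gives the exponent $k/(2k+1)$ of $r_n s_n^2$.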


Proof. Define Ao(x) 



:= ('Tn - t)1[^-_-](x) + (x - r+)l[-^^+](x), and 
, where r =: . Thus we find that 


Ai d(F„ — Fq) 


j Ai d(F 


n 


> - 




Fo) 

Al(/n 


/o) dA 


> -- 


T — T 

‘ n ‘ n 


2n 


+ 


Ai(/n-/o) dA, 


where the last line follows from Corollary 2.13. Now let $R_{1n} := \int \Delta_1 (f_n -
f_0)\,d\lambda$ and $R_{2n} := \int \Delta_1\,d(\mathbb{F}_n - F_0)$. The conclusion follows directly from the
following lemma. □













Lemma 7.4. Suppose (Al)-(A4) hold. Then Rin = Op{T^ — and 

R2n = Op{r-^). 

Proof of Lemma 7.4. Define pn '■= Onlgo on It is easy to see 

that — T~ = Op(l), so with large probability, for all n G N large enough, 
™4e[r+,ry] fo{x) > 0 by (A2). 

Rin = J Ai(x)(/„(x) - fo{x)) dx = J Ai(x)/o(x)^^|^ - dx 
= J - 1 )^ + (^ j^'^^x^n~^iPn{x) - 1 )'=^ 

where 9x,n G [1 A 1 V f^]- Now define 


Snj = j Ai(x)/o(x)^ 

Snk= Ai(x)/o(x) 

J Tn 

Expand /o around r, then we have 



{Pn{x) - ly dx,l < j < k - 1, 

^yyy’^ipnix) - ly dx. 


fc — 1 /.•T + 


Snj — 


E f: A.( 

1 — n T-n 

fTn 

Ai(x) 
J 


x) (x — xY ( ^ (Pnix) — ly dx 


1=0 

+ 


l\ 

fo\vn,x,k) 





k\ 



(x-x) ( ^ )(pn(a;)-l) dx. 


Snk = J 2 j_ ^{x - tY ^ {Pn{x) - ly dx 

/o \vn,x,k) „-r-kf 


<^d) l=\ 


1=0 ■ 


fTn 

Ai(x)- 

J Tm 


k\ 



oyy-yx-ffi , (p„(x)-irdx. 


dx. 


Now we see the dominating term is the first term in Sni since all other terms 
are of higher orders, and \9x,n — 1| = Op(l) uniformly locally in x in view of 
Theorem 3.7. We denote this term Qni- Note that l/go{xo) = 1 /g^^)+Op{l) 
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uniformly in r around xq, and that Qn is piecewise linear, yielding 


Qnl 


1 


-rfoir) 


^iix)—r^{gn{x) - go{x)) dx 
go{x) 


1 


50(to) 

1 


+ Op{l) 

+ Op(l) 


I ^i{x){gn{x) - go{x)) dx 

Tn 

/ ‘ n 

{gnif) - goif)) / Ai(x) dx 


50(to) 

/■Tn 

+ (s'Jf) - s'a(f)) / Ai(a^)(a! - f) Ax 


^ 9^ r- 


Ai(x)(x — tY dx 


i=2 


en(x)Ai(x)(x — tY dx 


where the first two terms in the bracket is zero by construction of Ai. Now 
note that 

rTn 

/ Ai(x)(x—r)-^ dx = 


1“ 


i^n - T„ 3 > 0, and 3 i 


j = 0, or 3 is odd; 

is even, 


and that 5 o'^^(r) = (xq) + Op(l))(r — xq)^ b This means that for 

j >2 and j even, 

- .y ci. = - x^r^ir: - .„-y« 

^ 3{go\xo) + Op{l)) , _ fe+2 

Further note that ||en||oo = Op(l) as r+ — r“ —0, we get Qni = Op(T^ — 
T~ This establishes the first claim. The proof for R 2 n follows the same 
line as in the proof of Lemma 4.4 Balabdaoui, Rufibach and Wellner (2009) 
pl318-1319. □ 


Lemma 7.5. We have the following: 

fo\xo) =f-(^ ^.^^5o(To)"'’"^(5o(To))^ 1 <j<k-l; 
fiy\xo) = k!^ j^^^go(xo)~’'~^(go(xo))^ - t 5 o ( to )"''"^ 5 o ^^( to )- 
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Proof. This follows from direct calculation. □ 

Lemma 7.6. For any M > 0, we have 

sup {g'nixo + Snt) - 9o(xo)| = 

\t\<M 

sup \gn{xo + Snt) - go{xo) - Sntc/o(^o)| = Op{Sn). 

\t\<M 

The proof is identical to Lemma 4.4 in Groeneboom, Jongbloed and Wellner 
(2001) so we shall omit it. 

Lemma 7.7. Let 


en{u) := fn{u) - - xo)^ - /o(xo) ^ ~ 

Then for any M > 0, we have sup|j|<;;^ \en{xo + Snt)| = Op{s^). 


Proof. Note that 


(7.1) 

fn{u) - Mxo) 





Define i'k,n,i{u) := Ej>fc+i ( /) " l)' = Ej>k+i ( 

go{xo)y ■ Note that 

{9n{u) - go{xo)y = {gn{u) - goixo) -{u- xo)5o(a;o) + {u- xo)5o(a:o))^ 

= Y\ \) [9n{u) -goixQ) -{u- xo)goixo)y{u - xoy~^gQ{xo)^ 

+ {u- xo)^g'o{xoy 

= Opisy-si-^) + Opisi) 

uniformly on {u : |u — xq\ < 

= Op(n“^), 
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if j > A; + 1. Here the third line follows from Lemma 7.6. This implies 
'^k,n,liu) = Op(n~ 2 fc+i uniformly on {u : |n — xqI < Using 

the same expansion in the first term on the right hand side of (7.1), we arrive 
at 


fn{u) - foixo) 


( 1 ) 


/ o ( to ) [go{xo)p ^ {x) ~ -(u- xo)go(xo)]''(u - xq )^ '’ 50 ( 3 : 0 )^" 

'- - -V- 

( 2 ) 



Xoy +fo(xo)'^k,n,l(^) ■ 

' -V-" 

_^ (4) 


By Lemma 7.5, we see that en{u) = (1) — (3) = (2) + (4) = Op{sy) uniformly 
on {tt : |u — xqI < This yields the desired result. □ 


We are now ready for the proof of Theorem 6.4. 
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Proof of Theorem 6.4. For the first assertion, note that 

r0)( 

j! 


[foixo)\ 

^ j=0 

--[Mxo)]~^ (fn{u) - fo{xo) - ^ - Xoy 


[/o(a;o)] ^(fo{xo)('^ 


i=i 

'-A {Sn(u) _ g 



V i / \9oixo) 


j=i 


by (7.1) 


-^k,n,l{u) + 

j=l 





j J \9o{xo) 


i=i 




-x\ ( 9n{n) 



go{xo) 


-1 - 


fo{XQ) 


/o(xo)(n - xo) 


+ E 

1=2 


( 9n{u) 





1=2 


j y \9q{xq) 

9n{u) - go(xo) - 9oM(u - xq)^ ^ ( /) ( goixo) ~ 0 


go(xo} 


1=2 


iMx,r‘j:^^(u-xoy 


1=2 


£/o(a;o) 

where 


5 n(R) - 50(2:0) - 5 o( 2 :o)(R - 2:0) ) + '^k,n, 2 (u), 



1=2 
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Now we calculate 

r . 

^ k,n,2{'^)dudv 


ln,XQ ^0 

1 o_ 2 

=-t n 2 fc+i sup 


'^k,n,l{u) + 

i=2 


—r 

j J h 





iMxo)] 




i=2 


ln,XQ XQ 


n,XQ 1^0 

(u — xqY dudu 


9o{xo) 


= 0 , 


i=2 
k-l 

-E 

i=2 

^i=2 




go(a:^o) 

jJ\9o{xo)J Ji 



{u — xqY dudu 


n,XQ 


fXQ 




9o{xo) 

j J y9o{xo)J Ji 



{u — xqY dudv 


n,XQ ^0 





1 


j J [9 o{xo)¥ 


=Op{rn ) + 

^i=2 


ln,XQ *^ 12^0 l = \ 

-l^ , (9'oixo) 


^ 01 i9n{u) - 50 (^ 0 ) - 5'o(to)(^^ - xq))\u - xoY \9'f){xQ)Y ' dudv 


(T)(i^) I, 


—r\ 1 



j J bo(To)]^' 


'ln,XQ XQ 


EOlfe n{u) - go{xo) - 9 'o{xo){u - xo))\u - xoY '[ 50 (^ 0 )]^ ' dudv'] 


:—Op{r^ + (2) + (1). 

Consider (1); for each {j,l) satisfying 1 < I < j < k and j >2, we have 
/» /»-?? 

{gn{u) - 50(to) - 5o(to)(w - To)) {u - xoy~Ygo{xo)y~’' dudv 


(1) : rr, 


ln,XQ Xq 


k+2 2 kl j-l k(l-l) + (j-l) 

= n 2 k+i ■ 0{n 2 fc+i) . Op{n 2 ^+ 1 ) • Op{n 2 fc+i) = Op{n 2^+1 ) = Op(l). 
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Consider (2) as follows: 


( 2 ) = 



dtidz; 


1 

(k + l)(k + 2)[kJ[go(xo}J 


Hence we have 


/ f 

'^ln,XQ '^ 3^0 


^fc,n,2(«)d?rdt> 


1 /^-A ( 9o{xo) 

{k + l){k + 2) \ k J \go{xo) 


f^+2 + Op(l). 


Note by definition we have 


(7.2) ^ ^ 7 r 

Jo{XO> Jxo 

Let n ^ oo, by the same calculation in the proof of Theorem 6.2 Groeneboom, Jongbloed and Wellner 
(2001), we have 


Y 


locmod 


(0 -^d 


VMxo) 


fw{s) 

Jo 


ds 


+ 


_ ( 9o{xo) \ 

{k + 2)\fo{xo) {k + l){k + 2) \ k J \go{xo) J 


A^\xo) 


I »F(.) d* - 

\/Mxo) Jo 9 o{xo){k + 2)\ 


k-\ 


±k-\-2 


xk-\-2 


where the last line follows from Lemma 7.5. Now we turn to the second 
assertion. It is easy to check by the definition of 'I'A:,n, 2 (') that 


(7.3) M“(t) = -rn [ r 4 'k,n, 2 {^)dudv. 

JO[Xo) Jxo 

On the other hand, simple calculation yields that (t) —(t) = rn(lHI„(a;o+ 
Snt) — Hn{xo + Snt)) > 0 where the inequality follows from Theorem 2.12. 
Combined with (7.2) and (7.3) we have shown the second assertion. Finally 
we show tightness of {An} and {Bn}- By Theorem 7.3, we can find M > 0 
and r G S{gn) such that 0 < r — xq < with large probability. 
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Now note 


A, 






{Fnixo) - Fnir)) - (F„(xo) - Fn(T)) 


+ 


TnSr, 


n 


'XQ 




j=0 


+ TnSr 


" 'f.^^{n-x„y-h{u)]dn 


^0 s=0 

r d{¥. 

J xn 


n Fq 


F '^n^n 

XQ 

= • Anl + An2 + AnS + n ^7(2^+!) 
We calculate three terms respectively. 


+ (2fc+i) 


Anl ^ f’nSn 


pT pT 

/ en(tt) du +rnSn / /o(to) 

JxQ 


XQ 

= Op{rnSn ■ Sn'^^) + Op{rnSn ' = Op{l), by Lemma 7.7 

ft\xo) 


—r 


.;(S3) 


^n2 ^ fnSn 


f 

^XQ 


k\ 


-{u — xqY du 


+ TnSr. 


{u - XoPCniu) du 


' XQ 


= Op(l), since ||e; 


n||oo 


0 as xo — T —>■„ 0. 


For An 3 , we follow the lines of Lemma 4.1 Balabdaoui, Rufibach and Wellner 
(2009) again to conclude. Fix R > 0, and consider the function class Fxq^r ■= 
{\xo,y] ■■ Xo < y < XQ + R}. Then Fxo,r{z) := l[a:o,a;o+A](^) is an envelop 
function for Fxo,Ri and = fx°^^ dz = R. Now let s = /c, d = 1 in 

Lemma 4.1 Balabdaoui, Rufibach and Wellner (2009), we have 


An3 — 



Fo){z) 


T I 1 rC-f-1 

< |t - + Op(l)n“2FFT 


Op{l). 


This completes the proof for tightness for {An}- {Bn} follows from similar 
argument so we omit the details. □ 


7.3. Auxiliary convex analysis. 

Lemma 7.8 (Lemma 4.3, Dümbgen, Samworth and Schuhmacher (2011)).
For any $\varphi(\cdot) \in \mathcal{G}$ with non-empty domain, and $\epsilon > 0$, define

$\varphi^{(\epsilon)}(x) := \sup_{(v,c)} (v^T x + c)$,

where the supremum is taken over all pairs $(v, c) \in \mathbb{R}^d \times \mathbb{R}$ such that

1. $\|v\| \le 1/\epsilon$;

2. $\varphi(y) \ge v^T y + c$ holds for all $y \in \mathbb{R}^d$.

Then $\varphi^{(\epsilon)} \in \mathcal{G}$ with Lipschitz constant $1/\epsilon$. Furthermore,

$\varphi^{(\epsilon)} \nearrow \varphi$, as $\epsilon \searrow 0$,

where the convergence is pointwise for all $x \in \mathbb{R}^d$.
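For intuition, the regularization $\varphi^{(\epsilon)}$ can be computed on a grid in one dimension. The sketch below relies on a standard convex-analysis fact that is not stated in the text: for a convex lower semi-continuous $\varphi$, the supremum of affine minorants with slope at most $1/\epsilon$ equals the infimal convolution $\inf_y \{\varphi(y) + |x - y|/\epsilon\}$. The grid, the test function $\varphi(x) = x^2$ and the helper name are illustrative assumptions.

```python
def lipschitz_minorant(phi, grid, eps):
    """Grid approximation of inf_y { phi(y) + |x - y| / eps } at each grid point x.

    Under the assumption stated in the lead-in, this coincides with the
    (1/eps)-Lipschitz regularization of a convex phi, up to grid truncation.
    """
    values = [phi(y) for y in grid]
    return [min(v + abs(x - y) / eps for y, v in zip(grid, values)) for x in grid]


if __name__ == "__main__":
    grid = [i / 40.0 for i in range(-200, 201)]  # the interval [-5, 5], step 0.025
    phi = lambda x: x * x
    coarse = lipschitz_minorant(phi, grid, 1.0)
    fine = lipschitz_minorant(phi, grid, 0.25)
    i = grid.index(2.0)
    # For phi(x) = x^2 and slope bound L = 1/eps, the envelope at |x| >= L/2 is L|x| - L^2/4.
    print(coarse[i], fine[i], phi(2.0))
```

With $\epsilon = 1$ the value at $x = 2$ is $2 - 1/4$, and it increases towards $\varphi(2) = 4$ as $\epsilon$ decreases, matching the monotone convergence in the lemma.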

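Lemma 7.9 (Lemma 2.13, Dümbgen, Samworth and Schuhmacher (2011)).
Given $Q \in \mathcal{Q}_0$, a point $x \in \mathbb{R}^d$ is an interior point of $\mathrm{csupp}(Q)$ if and only if

\[
h(Q, x) = \sup\{Q(C) : C \subset \mathbb{R}^d \text{ closed and convex},\ x \notin \mathrm{int}(C)\} < 1.
\]

Moreover, if $\{Q_n\} \subset \mathcal{Q}$ converges weakly to $Q$, then

\[
\limsup_{n \to \infty} h(Q_n, x) \le h(Q, x)
\]

holds for all $x \in \mathbb{R}^d$.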
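
In dimension $d = 1$ the quantity $h(Q, x)$ of Lemma 7.9 has a simple form: a closed convex set is an interval, and $x \notin \mathrm{int}(C)$ forces $C \subset (-\infty, x]$ or $C \subset [x, \infty)$, so $h(Q, x) = \max\{Q((-\infty, x]), Q([x, \infty))\}$. The sketch below evaluates this for an empirical measure; the sample values and the function name `h_empirical_1d` are hypothetical and serve only as an illustration.

```python
def h_empirical_1d(sample, x):
    """h(Q, x) for the empirical measure Q of `sample` on the real line."""
    n = len(sample)
    mass_left = sum(1 for v in sample if v <= x) / n    # Q((-inf, x])
    mass_right = sum(1 for v in sample if v >= x) / n   # Q([x, inf))
    return max(mass_left, mass_right)


sample = [0.2, 1.0, 1.5, 3.0]   # hypothetical data
for x in (0.2, 1.2, 3.5):
    print(x, h_empirical_1d(sample, x))
```

Here $h < 1$ only at $x = 1.2$, the one test point lying strictly between the smallest and largest observation, which is the interior of the convex support in this example.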

Lemma 7.10. If $g \in \mathcal{G}$, then there exist $a, b > 0$ such that for all $x \in \mathbb{R}^d$, $g(x) \ge a\|x\| - b$.

Proof. The proof is essentially the same as that of Lemma 1 of Cule and Samworth (2010), so we omit it. □

Consider the class of functions

\[
\mathcal{G}_M := \Big\{ g \in \mathcal{G} : \int g^{\beta}\,dx \le M \Big\}.
\]


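Lemma 7.11. For a given $g \in \mathcal{G}_M$, let $D_r := D(g, r) := \{g \le r\}$ be the level set of $g(\cdot)$ at level $r$, and $\epsilon := \inf g$. Then for $r > \epsilon$, we have

\[
\lambda(D_r) \le \frac{M(-s)(r - \epsilon)^d}{(s+1) \int_0^{r-\epsilon} v^d (v + \epsilon)^{1/s}\,dv},
\]

where $\beta = 1 + 1/s$, and $-1 < s < 0$.

Proof. For $u \in [\epsilon, r]$, by convexity of $g(\cdot)$, we have

\[
\lambda(D_u) \ge \Big( \frac{u - \epsilon}{r - \epsilon} \Big)^d \lambda(D_r).
\]

This can be seen as follows. Consider the epigraph $\Gamma_g$ of $g(\cdot)$, where $\Gamma_g = \{(t, x) \in \mathbb{R}^d \times \mathbb{R} : x \ge g(t)\}$. Let $x_0 \in \mathbb{R}^d$ be a minimizer of $g$. Consider the convex set $C_r = \mathrm{conv}(\Gamma_g \cap (\mathbb{R}^d \times \{r\}), (x_0, \epsilon)) \subset \Gamma_g \cap (\mathbb{R}^d \times (-\infty, r])$, where the inclusion follows from the convexity of $\Gamma_g$ as a subset of $\mathbb{R}^{d+1}$. The claimed inequality follows from

\[
\lambda_d(D_u) = \lambda_d(\pi_d(\Gamma_g \cap (\mathbb{R}^d \times \{u\}))) \ge \lambda_d(\pi_d(C_r \cap (\mathbb{R}^d \times \{u\}))) = \Big( \frac{u - \epsilon}{r - \epsilon} \Big)^d \lambda_d(D_r),
\]

where $\pi_d : \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^d$ is the natural projection onto the first component.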

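Now we calculate as follows. Since $\beta < 0$ and $\beta - 1 = 1/s$, for $x \in D_r$ we have $g(x)^{\beta} \ge g(x)^{\beta} - r^{\beta} = -\beta \int_{g(x)}^{r} u^{1/s}\,du$, and hence

\[
\begin{aligned}
M &\ge \int_{D_r} g(x)^{\beta}\,dx \\
&\ge -\Big( \frac{1}{s} + 1 \Big) \int_{D_r} \Big( \int_{\epsilon}^{r} 1(u \ge g(x))\, u^{1/s}\,du \Big)\,dx \\
&= -\Big( \frac{1}{s} + 1 \Big) \int_{\epsilon}^{r} u^{1/s}\,du \int_{D_r} 1(u \ge g(x))\,dx \\
&= -\Big( \frac{1}{s} + 1 \Big) \int_{\epsilon}^{r} \lambda(D_u)\, u^{1/s}\,du \\
&\ge -\Big( \frac{1}{s} + 1 \Big) \int_{\epsilon}^{r} \Big( \frac{u - \epsilon}{r - \epsilon} \Big)^d \lambda(D_r)\, u^{1/s}\,du \\
&= \lambda(D_r)\, \frac{(s+1) \int_{\epsilon}^{r} (u - \epsilon)^d u^{1/s}\,du}{(-s)(r - \epsilon)^d}.
\end{aligned}
\]

The substitution $v = u - \epsilon$ in the integral gives the desired inequality. □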
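
The bound of Lemma 7.11 can be checked numerically in a simple case. The sketch below takes $d = 1$, $s = -1/3$ (so $\beta = -2$) and $g(x) = 1 + |x|$, for which $\int g^{\beta}\,dx = 2$, $\inf g = 1$ and $\lambda(\{g \le r\}) = 2(r - 1)$. The example function, the helper name `lemma_7_11_bound` and the midpoint-rule quadrature are assumptions made for illustration, not part of the paper.

```python
def lemma_7_11_bound(M, s, r, eps, d, steps=20000):
    """Right-hand side of the level-set bound, with the integral over
    [0, r - eps] approximated by the midpoint rule."""
    a = r - eps
    h = a / steps
    integral = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        integral += v ** d * (v + eps) ** (1.0 / s)
    integral *= h
    return M * (-s) * a ** d / ((s + 1.0) * integral)


# Illustrative choice (an assumption): g(x) = 1 + |x| on the real line,
# s = -1/3, so beta = -2, the integral of g^beta is 2, and inf g = 1.
s, M, eps, d = -1.0 / 3.0, 2.0, 1.0, 1
for r in (1.5, 3.0, 11.0):
    level_set_length = 2.0 * (r - eps)      # Lebesgue measure of {g <= r}
    print(r, level_set_length, lemma_7_11_bound(M, s, r, eps, d))
```

For this $g$ the right-hand side has the closed form $2r^2/(r-1)$, which always exceeds $2(r-1)$, so the inequality holds with room to spare in this example.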


Lemma 7.12. Let $G$ be a convex set in $\mathbb{R}^d$ with non-empty interior, and let $\{y_n\}_{n \in \mathbb{N}}$ be a sequence with $\|y_n\| \to \infty$ as $n \to \infty$. Then there exist $\{x_1, \ldots, x_d\} \subset G$ such that

\[
\lambda_d(\mathrm{conv}(x_1, \ldots, x_d, y_{n(k)})) \to \infty
\]

as $k \to \infty$, where $\{y_{n(k)}\}_{k \in \mathbb{N}}$ is a suitable subsequence of $\{y_n\}_{n \in \mathbb{N}}$.

Proof. Without loss of generality we assume $0 \in \mathrm{int}(G)$, and we first choose a convergent subsequence $\{y_{n(k)}/\|y_{n(k)}\|\}_{k \in \mathbb{N}}$ from $\{y_n/\|y_n\|\}_{n \in \mathbb{N}}$. Now if we let $a := \lim_{k \to \infty} y_{n(k)}/\|y_{n(k)}\|$, then $\|a\| = 1$. Since $G$ has non-empty interior, $\{a^T x = 0\} \cap G$ has non-empty relative interior. Thus we can choose $x_1, \ldots, x_d \in \{a^T x = 0\} \cap G$ such that $K := \mathrm{conv}(x_1, \ldots, x_d)$ satisfies $\lambda_{d-1}(K) > 0$. Note that

\[
\mathrm{dist}(y_{n(k)}, \mathrm{aff}(K)) = \mathrm{dist}(y_{n(k)}, \{a^T x = 0\}) = \langle y_{n(k)}, a \rangle = \|y_{n(k)}\| \langle y_{n(k)}/\|y_{n(k)}\|, a \rangle \to \infty
\]

as $k \to \infty$. Since

\[
\lambda_d(\mathrm{conv}(x_1, \ldots, x_d, y_{n(k)})) = \lambda_d(\mathrm{conv}(K, y_{n(k)})) = c\, \lambda_{d-1}(K) \cdot \mathrm{dist}(y_{n(k)}, \mathrm{aff}(K))
\]

for some constant $c = c(d) > 0$, the proof is complete as we let $k \to \infty$. □
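
The volume identity at the end of the proof of Lemma 7.12 is the usual cone-volume formula, for which $c(d) = 1/d$. The sketch below checks it in $\mathbb{R}^3$ by comparing a determinant-based simplex volume with $\lambda_{d-1}(K) \cdot \mathrm{dist}/d$. The particular points, the direction $a = (0, 0, 1)$ and the helper names `det` and `simplex_volume` are hypothetical choices for illustration.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
    return result


def simplex_volume(vertices):
    """Lebesgue measure of conv(vertices) for d + 1 points in R^d."""
    base = vertices[0]
    d = len(base)
    edges = [[v[i] - base[i] for i in range(d)] for v in vertices[1:]]
    return abs(det(edges)) / math.factorial(d)


# Hypothetical configuration in R^3 with a = (0, 0, 1): the base points lie in
# the hyperplane {a^T x = 0}, and t is the distance of y from that hyperplane.
x1, x2, x3 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, -1.0, 0.0]
base_area = 1.5   # area of the triangle K = conv(x1, x2, x3)
for t in (1.0, 10.0, 100.0):
    y = [5.0, -2.0, t]
    print(t, simplex_volume([x1, x2, x3, y]), base_area * t / 3.0)
```

The two printed volumes agree for each $t$ and grow linearly in the distance $t$, which is the mechanism that drives the volume to infinity in the lemma.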


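Lemma 7.13 (Lemma 4.2, Dümbgen, Samworth and Schuhmacher (2011)).
Let $\underline{g}$ and $\{g_n\}_{n \in \mathbb{N}}$ be functions in $\mathcal{G}$ such that $g_n \ge \underline{g}$ for all $n \in \mathbb{N}$. Suppose the set $C := \{x \in \mathbb{R}^d : \limsup_{n \to \infty} g_n(x) < \infty\}$ is non-empty. Then there exist a subsequence $\{g_{n(k)}\}_{k \in \mathbb{N}}$ of $\{g_n\}_{n \in \mathbb{N}}$ and a function $g \in \mathcal{G}$ such that $C \subset \mathrm{dom}(g)$ and

\[
\begin{aligned}
\lim_{k \to \infty,\, x \to y} g_{n(k)}(x) &= g(y), && \text{for all } y \in \mathrm{int}(\mathrm{dom}(g)), \\
\liminf_{k \to \infty,\, x \to y} g_{n(k)}(x) &\ge g(y), && \text{for all } y \in \mathbb{R}^d.
\end{aligned}
\tag{7.4}
\]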


Lemma 7.14. Let $\{g_n\}$ be a sequence of non-negative convex functions satisfying the following conditions:

(A1). There exists a convex set $G$ with non-empty interior such that for all $x_0 \in \mathrm{int}(G)$, we have $\sup_{n \in \mathbb{N}} g_n(x_0) < \infty$.

(A2). There exists some $M > 0$ such that $\sup_{n \in \mathbb{N}} \int (g_n(x))^{\beta}\,dx \le M < \infty$.

Then there exist $a, b > 0$ such that for all $x \in \mathbb{R}^d$ and $k \in \mathbb{N}$,

\[
g_{n(k)}(x) \ge a\|x\| - b,
\]

where $\{g_{n(k)}\}_{k \in \mathbb{N}}$ is a suitable subsequence of $\{g_n\}_{n \in \mathbb{N}}$.

Proof. Without loss of generality we may assume $G$ is contained in all $\mathrm{int}(\mathrm{dom}(g_n))$. We first note that (A1)-(A2) imply that $\{x_n \in \mathrm{Argmin}_{x \in \mathbb{R}^d}\, g_n(x)\}_{n=1}^{\infty}$ is a bounded sequence, i.e.

\[
\sup_{n \in \mathbb{N}} \|x_n\| < \infty. \tag{7.5}
\]

Suppose not; then without loss of generality we may assume $\|x_n\| \to \infty$ as $n \to \infty$. By Lemma 7.12, we can choose $\{x_1, \ldots, x_d\} \subset G$ such that $\lambda_d(\mathrm{conv}(x_1, \ldots, x_d, x_{n(k)})) \to \infty$ as $k \to \infty$ for some subsequence $\{x_{n(k)}\} \subset \{x_n\}$. For simplicity of notation we think of $\{x_n\}$ as such a subsequence. Denote $\epsilon_n := \inf_{x \in \mathbb{R}^d} g_n(x)$ and $M_2 := \sup_{n \in \mathbb{N}} \epsilon_n \le \sup_{n \in \mathbb{N}} g_n(x_0) < \infty$ by (A1). Again by (A1) and convexity we may assume that

\[
\sup_{x \in \mathrm{conv}(x_1, \ldots, x_d, x_n)} g_n(x) \le M_1
\]

holds for some $M_1 > 0$ and all $n \in \mathbb{N}$. This implies that

\[
\int g_n^{\beta}(x)\,dx \ge M_1^{\beta}\, \lambda_d(\mathrm{conv}(x_1, \ldots, x_d, x_n)) \to \infty
\]

as $n \to \infty$, which contradicts (A2). This shows (7.5).

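Now we define $g(\cdot)$ to be the convex hull of $\tilde{g}(x) := \inf_{n \in \mathbb{N}} g_n(x)$; then $g \le g_n$ holds for all $n \in \mathbb{N}$. We claim that $g(x) \to \infty$ as $\|x\| \to \infty$. By Lemma 7.11, for fixed $\eta > 1$, we have

\[
\lambda_d(D(g_n, \eta M_2)) \le \frac{M(-s)(\eta M_2 - \epsilon_n)^d}{(s+1) \int_0^{\eta M_2 - \epsilon_n} u^d (u + \epsilon_n)^{1/s}\,du}
 \le \frac{M(-s)(\eta M_2)^d}{(s+1) \int_0^{(\eta - 1) M_2} u^d (u + M_2)^{1/s}\,du} < \infty,
\]

where $D(g_n, \eta M_2) := \{g_n \le \eta M_2\}$. Hence

\[
\sup_{n \in \mathbb{N}} \lambda_d(D(g_n, \eta M_2)) < \infty \tag{7.6}
\]

holds for every $\eta > 1$. Now combining (7.5) and (7.6), we claim that, for fixed $\eta$ large enough, it is possible to find $R = R(\eta) > 0$ such that

\[
g_n(x) \ge \eta M_2 \tag{7.7}
\]

holds for all $\|x\| \ge R(\eta)$ and $n \in \mathbb{N}$. If this is not true, then for all $k \in \mathbb{N}$, we can find $n(k) \in \mathbb{N}$ and $x_k \in \mathbb{R}^d$ with $\|x_k\| \ge k$ such that $g_{n(k)}(x_k) < \eta M_2$. We consider two cases to derive a contradiction.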

[Case 1.] If for some $n_0 \in \mathbb{N}$ there exist infinitely many $k \in \mathbb{N}$ with $n(k) = n_0$, then we may assume without loss of generality that we can find a sequence $\{x_k\}_{k \in \mathbb{N}}$ with $\|x_k\| \to \infty$ as $k \to \infty$, and $g_{n_0}(x_k) < \eta M_2$. Since the support of $g_{n_0}$ has non-empty interior, by Lemma 7.12, we can find $x_1, \ldots, x_d \in \mathrm{supp}(g_{n_0})$ such that $\lambda_d(\mathrm{conv}(x_1, \ldots, x_d, x_{k(j)})) \to \infty$ as $j \to \infty$ holds for some subsequence $\{x_{k(j)}\}_{j \in \mathbb{N}}$ of $\{x_k\}_{k \in \mathbb{N}}$. Let $M := \max_{1 \le i \le d} g_{n_0}(x_i)$; then we find $\lambda_d(D(g_{n_0}, M \vee \eta M_2)) = \infty$. This contradicts (7.6).

[Case 2.] If $\#\{k \in \mathbb{N} : n = n(k)\} < \infty$ for all $n \in \mathbb{N}$, then without loss of generality we may assume that for all $k \in \mathbb{N}$, we can find $x_k \in \mathbb{R}^d$ with $\|x_k\| \ge k$ such that $g_k(x_k) < \eta M_2$. Recall that by assumption (A1) the convex set $G$ has non-empty interior, and is contained in the support of $g_n$ for all $n \in \mathbb{N}$. Again by Lemma 7.12, we may take $x_1, \ldots, x_d \in G$ such that $\lambda_d(\mathrm{conv}(x_1, \ldots, x_d, x_{k(j)})) \to \infty$ as $j \to \infty$ holds for some subsequence $\{x_{k(j)}\}_{j \in \mathbb{N}}$ of $\{x_k\}_{k \in \mathbb{N}}$. In view of (A1), we conclude by convexity that $M := \max_{1 \le i \le d} \sup_{j \in \mathbb{N}} g_{k(j)}(x_i) < \infty$. This implies

\[
\lambda_d(D(g_{k(j)}, M \vee \eta M_2)) \ge \lambda_d(\mathrm{conv}(x_1, \ldots, x_d, x_{k(j)})) \to \infty, \quad j \to \infty,
\]

which gives a contradiction.

Combining these two cases we have proved (7.7). This implies that $\tilde{g}(x) = \inf_{n \in \mathbb{N}} g_n(x) \to \infty$ as $\|x\| \to \infty$, which verifies the claim that its convex hull satisfies $g(x) \to \infty$ as $\|x\| \to \infty$. Hence in view of Lemma 7.10, there exist $a, b > 0$ such that $g_n(x) \ge a\|x\| - b$ holds for all $x \in \mathbb{R}^d$ and $n \in \mathbb{N}$. □

Lemma 7.15. Assume $x_0, \ldots, x_d \in \mathbb{R}^d$ are in general position. If $g(\cdot)$ is a non-negative function with $\Delta = \mathrm{conv}(x_0, \ldots, x_d) \subset \mathrm{dom}(g)$ and $g(x_0) = 0$, then for $r \ge d$ we have $\int_{\Delta} (g(x))^{-r}\,dx = \infty$.


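Proof. We may assume without loss of generality that $x_0 = 0$ and $x_i = e_i \in \mathbb{R}^d$, where $e_i$ is the unit vector with 1 in its $i$-th coordinate and 0 otherwise. Then $\Delta = \Delta_0 := \{x \in \mathbb{R}^d : \sum_{i=1}^d x_i \le 1,\ x_i \ge 0,\ \forall i = 1, \ldots, d\}$. Denote $a_i = g(x_i) \ge 0$. We may assume there is at least one $a_i \ne 0$. Then by convexity of $g$ we find $g(x) \le \sum_{i=1}^d a_i x_i$ for all $x \in \Delta_0$. This gives

\[
\int_{\Delta_0} (g(x))^{-r}\,dx \ge \int_{\Delta_0} \Big( \sum_{i=1}^d a_i x_i \Big)^{-r}\,dx
 \ge \int_{\Delta_0} \frac{1}{(\max_{i=1,\ldots,d} a_i)^r \|x\|_1^r}\,dx
 \ge \frac{1}{(\max_{i=1,\ldots,d} a_i)^r d^{r/2}} \int_{C_0} \frac{1}{\|x\|_2^r}\,dx = \infty,
\]

where $C_0 := \{\|x\|_2 \le 1/\sqrt{d}\} \cap \{x_i \ge 0,\ i = 1, \ldots, d\}$. Note we used the fact that $\|x\|_1 \le \sqrt{d}\,\|x\|_2$. □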
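
The divergence of the last integral in the proof of Lemma 7.15 can be made explicit in polar coordinates. The following is a short worked step, not taken from the paper; the symbol $\sigma_{d-1}$ for the surface area of the unit sphere in $\mathbb{R}^d$ is introduced here for illustration. The set $C_0$ is the part of a centered ball of radius $1/\sqrt{d}$ lying in the non-negative orthant, which is a fraction $2^{-d}$ of the ball, so

\[
\int_{C_0} \frac{dx}{\|x\|_2^r} = \frac{\sigma_{d-1}}{2^d} \int_0^{1/\sqrt{d}} \rho^{\,d-1-r}\,d\rho .
\]

The radial integral is infinite exactly when $d - 1 - r \le -1$, that is, when $r \ge d$. For instance, in $d = 1$ with $r = 1$ it reduces to $\int_0^1 \rho^{-1}\,d\rho = \infty$.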


Lemma 7.16 (Theorem 1.11, Bhattacharya and Ranga Rao (1976)). Let $f_n \to_d f$, and let $\mathcal{D}$ be the class of all Borel measurable, convex subsets in $\mathbb{R}^d$. Then $\lim_{n \to \infty} \sup_{D \in \mathcal{D}} \big| \int_D (f_n - f) \big| = 0$.